semgrep_scan MCP tool fails with `Cannot create auto config when metrics are off` when `config=None` and `SEMGREP_METRICS=off` #11649

## Summary

The `semgrep_scan` MCP tool fails on Semgrep 1.146.0+ whenever the subprocess environment has metrics disabled (`SEMGREP_METRICS=off` / `SEMGREP_SEND_METRICS=off`) **and** no `SEMGREP_APP_TOKEN` is set. The helper `get_semgrep_scan_args(temp_dir, config=None)` at `cli/src/semgrep/mcp/server.py:252-270` omits `--config` when the caller passes no explicit rule source. Semgrep then falls back to `--config auto`, and `auto` requires metrics to be enabled. The subprocess exits with code 2 and the MCP tool returns `isError: true` with the metrics-refusal message in its text.

This bug was previously filed as #11644 with an incorrect root-cause analysis that blamed rejection of the `--x-mcp` flag. This issue replaces #11644 with the correct diagnosis. Empirical testing shows `--x-mcp` is still a valid registered flag (documented as `[INTERNAL]` in `semgrep scan --help`, wired in both `cli/src/semgrep/commands/scan.py:700` and `src/osemgrep/cli_scan/Scan_CLI.ml:1112-1114`) and is orthogonal to the failure.

## Exit-code evidence

Run inside the official `mekayelanik/semgrep-mcp-server:latest` image (Semgrep 1.159.0, `SEMGREP_METRICS=off`, OSS mode, no token):

| Variant | `--x-mcp` | `--config` | Metrics | Exit | Notes |
|---------|:---------:|:-----------|:-------:|:----:|:------|
| A | yes | omitted | off | **2** | `ERROR: Cannot create auto config when metrics are off` |
| B | yes | omitted | **on** | **0** | `auto` works when metrics are on |
| C | yes | `p/default` | off | **0** | explicit config bypasses the metrics gate |
| **D** | **no** | **omitted** | **off** | **2** | **same ERROR with no `--x-` WARNING at all** — proves `--x-mcp` is not the cause |
| E | yes | `p/default` | off | **0** | `--x-mcp` alone never causes a non-zero exit |

Reproduction (shell, inside a container with Semgrep 1.159.0):

```sh
mkdir -p /tmp/d && printf "def hello():\n    print(1)\n" > /tmp/d/code.py
export SEMGREP_METRICS=off
export SEMGREP_SEND_METRICS=off
# Variant A (the failure)
semgrep scan --json --experimental --x-mcp /tmp/d
echo "exit=$?"
# => exit=2, stderr: "Cannot create auto config when metrics are off"
# Variant D (strip --x-mcp — same failure)
semgrep scan --json --experimental /tmp/d
echo "exit=$?"
# => exit=2, same ERROR
```
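
The same matrix can be scripted so the exit codes are easy to re-check on other versions. The following is a minimal sketch, not part of Semgrep: `VARIANTS`, `build_cmd`, and `run_variant` are names invented here, and it assumes `semgrep` is on `PATH` and that `/tmp/d` exists as in the shell snippet above. It uses only the flags and environment variables already shown in this report.

```python
import os
import shutil
import subprocess

# Variants A to D from the table: (pass --x-mcp?, --config value or None, metrics).
VARIANTS = {
    "A": (True, None, "off"),
    "B": (True, None, "on"),
    "C": (True, "p/default", "off"),
    "D": (False, None, "off"),
}


def build_cmd(x_mcp, config, target):
    """Assemble the `semgrep scan` command line for one variant."""
    cmd = ["semgrep", "scan", "--json", "--experimental"]
    if x_mcp:
        cmd.append("--x-mcp")
    if config is not None:
        cmd += ["--config", config]
    cmd.append(target)
    return cmd


def run_variant(name, target="/tmp/d"):
    """Run one variant; return its exit code, or None if semgrep is not installed."""
    if shutil.which("semgrep") is None:
        return None
    x_mcp, config, metrics = VARIANTS[name]
    env = dict(os.environ, SEMGREP_METRICS=metrics, SEMGREP_SEND_METRICS=metrics)
    env.pop("SEMGREP_APP_TOKEN", None)  # stay in OSS mode, no token
    proc = subprocess.run(
        build_cmd(x_mcp, config, target), env=env, capture_output=True, text=True
    )
    return proc.returncode
```

Calling `run_variant(name)` for each key in `VARIANTS` should reproduce the Exit column if the environment matches the one described here.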

## MCP end-to-end reproduction

```sh
docker run --rm -i --entrypoint /bin/sh mekayelanik/semgrep-mcp-server:latest -c '
  printf "def hello():\n    print(1)\n" > /tmp/d.py
  (
    echo "{\"jsonrpc\":\"2.0\",\"id\":0,\"method\":\"initialize\",\"params\":{\"protocolVersion\":\"2024-11-05\",\"capabilities\":{},\"clientInfo\":{\"name\":\"t\",\"version\":\"1\"}}}"
    echo "{\"jsonrpc\":\"2.0\",\"method\":\"notifications/initialized\"}"
    echo "{\"jsonrpc\":\"2.0\",\"id\":1,\"method\":\"tools/call\",\"params\":{\"name\":\"semgrep_scan\",\"arguments\":{\"code_files\":[{\"path\":\"/tmp/d.py\"}]}}}"
    sleep 2
  ) | semgrep mcp -t stdio
'
```

Server-side stderr shows:

```
semgrep_scan_cli failed: Error running semgrep: (2) [WARNING]: ... options starting with '--x-' ...
[ERROR]: Cannot create auto config when metrics are off. Please allow metrics or run with a specific config.
```

The `(2)` is `process.returncode`. The `[WARNING]` is cosmetic OCaml output that `semgrep-core` writes to stderr, and it appears whether or not the scan succeeds. The `[ERROR]` line is the actual cause of the failure.
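
For anyone triaging similar reports, the three parts of that message can be separated mechanically. This is an illustrative sketch only: `parse_failure` is a hypothetical helper, and the regular expression assumes the `Error running semgrep: (<code>)` prefix stays exactly as quoted above.

```python
import re


def parse_failure(message):
    """Return (exit code or None, list of [ERROR] messages) from an MCP error string."""
    match = re.search(r"Error running semgrep: \((\d+)\)", message)
    code = int(match.group(1)) if match else None
    errors = [
        line.split("[ERROR]:", 1)[1].strip()
        for line in message.splitlines()
        if "[ERROR]:" in line
    ]
    return code, errors


SAMPLE = (
    "semgrep_scan_cli failed: Error running semgrep: (2) [WARNING]: ... "
    "options starting with '--x-' ...\n"
    "[ERROR]: Cannot create auto config when metrics are off. "
    "Please allow metrics or run with a specific config."
)

code, errors = parse_failure(SAMPLE)
print(code, errors)
```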

## Root cause

`cli/src/semgrep/mcp/server.py:252` — `get_semgrep_scan_args(temp_dir, config=None)`:

```python
args = ["scan", "--json", "--experimental"]
args.extend(["--x-mcp"])
if config:
    args.extend(["--config", config])
args.append(temp_dir)
return args
```

When `config is None`, no `--config` is added, so Semgrep applies its default of `--config auto`. On 1.146.0+, `auto` is refused when `SEMGREP_METRICS=off`, which packaged MCP images set by default. Two MCP-side callers can reach this helper with `config=None`: `semgrep_scan` at `server.py:958`, which always does, and `semgrep_scan_with_custom_rule` at `server.py:759`, which would do so only if `rule_file_path` were `None`. As a result, `semgrep_scan` fails on every call in OSS mode with metrics off.

`semgrep_scan_with_custom_rule` is unaffected in practice because the custom-rule tool always has a YAML file, so its caller always supplies `config=<path to rule.yaml>`.

`semgrep_scan_supply_chain` is unaffected because it passes `--config supply-chain` explicitly.
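
To make the difference concrete, here is a standalone copy of the excerpt above wrapped in a function so it can be run outside the Semgrep tree. It is for illustration only; the real helper lives in `server.py`, and the `/tmp/d` target is just the directory used in the reproduction.

```python
def get_semgrep_scan_args(temp_dir, config=None):
    """Standalone copy of the quoted excerpt, for illustration only."""
    args = ["scan", "--json", "--experimental"]
    args.extend(["--x-mcp"])
    if config:
        args.extend(["--config", config])
    args.append(temp_dir)
    return args


# What semgrep_scan effectively passes today: no --config at all.
default_args = get_semgrep_scan_args("/tmp/d")
# What a caller with an explicit rule source passes.
explicit_args = get_semgrep_scan_args("/tmp/d", config="p/default")

print(default_args)
print(explicit_args)
```

The first list contains no `--config`, which is the case Semgrep resolves to `auto`; the second carries an explicit rule source and does not hit the metrics gate.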

## Suggested fix — policy options for upstream to choose

1. **Prefer `SEMGREP_RULES`**: in `get_semgrep_scan_args`, if `config is None`, read `os.environ.get("SEMGREP_RULES")` and apply each token as a separate `--config` (matches the behavior of the Click `-f/--config` option that is already `envvar="SEMGREP_RULES"`, `multiple=True`). Fall through to `auto` only if `SEMGREP_RULES` is unset. This makes `semgrep_scan` work offline-by-default for image operators who bake in `SEMGREP_RULES=p/default` and does not commit upstream to any policy change around `auto`.
2. **Surface a clearer error**: if `config is None` + metrics disabled + no token, raise `McpError` at the MCP layer with a user-facing message pointing at `SEMGREP_RULES` or `SEMGREP_APP_TOKEN`, instead of propagating semgrep-core's generic error text.
3. **MCP-origin-implies-metrics-on**: not recommended (silently re-enables metrics users explicitly disabled), but listed for completeness.
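
A rough sketch of what option (1) could look like follows. It is not a tested patch against `server.py`. The `environ` parameter is a hypothetical addition that only exists to make the function testable, and splitting `SEMGREP_RULES` on whitespace is an assumption about how the tokens should be separated.

```python
import os


def get_semgrep_scan_args(temp_dir, config=None, environ=None):
    """Build scan args, falling back to SEMGREP_RULES when no config is given."""
    env = os.environ if environ is None else environ
    args = ["scan", "--json", "--experimental", "--x-mcp"]
    if config:
        configs = [config]
    else:
        # Assumption: tokens in SEMGREP_RULES are whitespace-separated.
        configs = env.get("SEMGREP_RULES", "").split()
    for rule_source in configs:
        args.extend(["--config", rule_source])
    # With no explicit config and no SEMGREP_RULES, nothing is added and
    # Semgrep keeps its current `auto` default.
    args.append(temp_dir)
    return args
```

An explicit `config` still wins over the environment, so existing callers such as the custom-rule tool would behave as before.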

Option (1) is the smallest, composes cleanly with the existing multi-ruleset path, and is a one-file change plus a test. Happy to open a PR against `develop` if the semgrep team picks (1); holding back because the policy call is yours.
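
If option (2) is preferred instead, or in addition, the pre-flight check could be as small as the sketch below. `ScanConfigError` is a hypothetical stand-in for whatever error type the MCP layer raises, `check_scan_preconditions` and its `environ` parameter are invented for this example, and treating only the literal value `off` as "metrics disabled" is an assumption.

```python
import os


class ScanConfigError(Exception):
    """Hypothetical stand-in for the MCP-layer error type."""


def check_scan_preconditions(config=None, environ=None):
    """Raise a descriptive error before launching a scan that cannot succeed."""
    env = os.environ if environ is None else environ
    if config:
        return  # an explicit rule source never reaches the `auto` path
    metrics_off = any(
        env.get(name, "").lower() == "off"
        for name in ("SEMGREP_METRICS", "SEMGREP_SEND_METRICS")
    )
    has_token = bool(env.get("SEMGREP_APP_TOKEN"))
    if metrics_off and not has_token:
        raise ScanConfigError(
            "semgrep_scan has no rule source: the default `auto` config is "
            "unavailable while metrics are off. Set SEMGREP_RULES "
            "(for example p/default) or SEMGREP_APP_TOKEN."
        )
```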

## Environment

- Semgrep 1.159.0 (latest PyPI as of 2026-04-19); also reproduced against `develop` HEAD (`d921f451`, Release 1.160.0).
- `semgrep mcp -t stdio` — same failure through supergateway (`/mcp` SHTTP or `/sse`) or direct stdio.
- OSS mode, no `SEMGREP_APP_TOKEN`, `SEMGREP_METRICS=off` (default in packaged MCP images for privacy).

## Tools working in the same configuration

- `semgrep_scan_with_custom_rule(code_files, rule)` — passes `--config <rule.yaml>`, unaffected.
- `semgrep_scan_supply_chain()` — hard-coded `--config supply-chain`, unaffected (modulo the separate Pro Engine / `roots` callback requirements).

## Workaround for users today

- Set `SEMGREP_APP_TOKEN` → `semgrep_scan` uses the logged-in org policy path (different code path from `auto`), succeeds.
- Or set `SEMGREP_METRICS=on` (privacy trade-off) → `auto` works.
- Or use `semgrep_scan_with_custom_rule` with a bundled YAML (`rules:` list with N rules).
- Or exec the CLI directly out-of-band: `docker exec <container> semgrep scan --config p/default --json /code`.
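
For the custom-rule workaround, the sketch below builds the JSON-RPC request a client would send. The `code_files` and `rule` argument names come from this report, but the exact shape of each `code_files` entry is an assumption that mirrors the `semgrep_scan` call in the end-to-end reproduction. The rule itself is a made-up example, and `custom_rule_request` is a hypothetical helper.

```python
import json

# A minimal bundled rule file; more rules can be added to the `rules:` list.
RULE_YAML = """\
rules:
  - id: example-print-call
    languages: [python]
    severity: INFO
    message: print() call found
    pattern: print(...)
"""


def custom_rule_request(paths, rule_yaml, request_id=1):
    """Build a tools/call request for semgrep_scan_with_custom_rule."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "semgrep_scan_with_custom_rule",
            "arguments": {
                # Assumed entry shape, copied from the semgrep_scan example.
                "code_files": [{"path": path} for path in paths],
                "rule": rule_yaml,
            },
        },
    }


request = custom_rule_request(["/tmp/d.py"], RULE_YAML)
print(json.dumps(request))
```

The printed line can take the place of the `semgrep_scan` request in the stdio reproduction above.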

