Describe the bug
YAML anchors/aliases (e.g. & and *) are invalid when using the --validate flag, but appear to correctly detect findings when running on live code (despite logging semgrep-core rule validation failed (RuleParseError)).
To Reproduce
Hard to repro on semgrep.dev, so here's a local example.
Semgrep rule using YAML anchors/aliases (& and *):
$ cat rule.yaml
rules:
  - id: yaml-anchor-bug
    message: found bad
    languages: [yaml]
    severity: WARNING
    pattern-either:
      - patterns: &anchor
          - pattern: "key: $X"
          - metavariable-regex:
              metavariable: $X
              regex: bad
      - patterns:
          - pattern: |
              data:
                ...
                $KEY: $VALUE
          - metavariable-pattern:
              metavariable: $VALUE
              language: yaml
              patterns: *anchor
Rule test:
$ cat rule.test.yaml
# ruleid: yaml-anchor-bug
key: bad
---
# ok: yaml-anchor-bug
key: good
---
data:
  config.yaml: |
    # ruleid: yaml-anchor-bug
    key: bad
---
data:
  config.yaml: |
    # ok: yaml-anchor-bug
    key: good
--validate fails:
$ semgrep --validate -c rule.yaml
Configuration is invalid - found 1 configuration error(s), and 1 rule(s).
[ERROR] Rule parse error in rule yaml-anchor-bug:
Expected a list for patterns
Running the rule produces the correct findings, but includes a validation error message:
$ semgrep -c rule.yaml rule.test.yaml
...
⠙ Loading rules...semgrep-core rule validation failed (RuleParseError)
⠹ Loading rules... Scanning 1 file (only git-tracked) with 1 Code rule:
...
rule.test.yaml
❯❱ yaml-anchor-bug
❰❰ Blocking ❱❱
found bad
            2┆ key: bad
             ⋮┆----------------------------------------
            7┆ data:
            8┆   config.yaml: |
            9┆     # ruleid: yaml-anchor-bug
           10┆     key: bad
Ran 1 rule on 1 file: 2 findings.
Note the "semgrep-core rule validation failed (RuleParseError)" line. It seems like there are different parsers being used for validation and running of the YAML rules.
Expected behavior
I guess the primary desire here is for aliases/anchors to work, which they partially do. However, the secondary desire would be consistency between --validate and running the rules.
Environment
Official binary
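The following is an added example, not part of the bug report above and not code from the semgrep project. It shows that a spec-compliant YAML loader resolves the `*anchor` alias in the reported rule to the very same list that `&anchor` marks (so every `patterns` key really is a list, which is the property the "Expected a list for patterns" error is about), and it produces an alias-free copy of the rule as a practical workaround for tooling whose validation parser rejects aliases. It assumes PyYAML is installed (`pip install pyyaml`); the rule text inside it is a constructed copy of the reporter's rule.yaml with indentation restored.

```python
"""Added example (not from the bug report or the semgrep project).

Assumption: PyYAML is installed. RULE_YAML is a constructed copy of the
rule from the report; the alias-free output is meant for tooling that
rejects YAML anchors/aliases during validation.
"""
import yaml

# Constructed copy of rule.yaml from the report (indentation restored).
RULE_YAML = """\
rules:
  - id: yaml-anchor-bug
    message: found bad
    languages: [yaml]
    severity: WARNING
    pattern-either:
      - patterns: &anchor
          - pattern: "key: $X"
          - metavariable-regex:
              metavariable: $X
              regex: bad
      - patterns:
          - pattern: |
              data:
                ...
                $KEY: $VALUE
          - metavariable-pattern:
              metavariable: $VALUE
              language: yaml
              patterns: *anchor
"""


class NoAliasDumper(yaml.SafeDumper):
    """SafeDumper that never re-emits &anchor/*alias for shared objects."""

    def ignore_aliases(self, data):
        return True


def load_rule(text):
    """Parse rule text with a spec-compliant loader; aliases are resolved."""
    return yaml.safe_load(text)


def anchor_and_alias_share_list(doc):
    """True if the aliased `patterns` value is the same list object as the
    anchored one, i.e. the alias was resolved rather than left as text."""
    rule = doc["rules"][0]
    anchored = rule["pattern-either"][0]["patterns"]
    mv_pattern = rule["pattern-either"][1]["patterns"][1]["metavariable-pattern"]
    return anchored is mv_pattern["patterns"]


def patterns_keys_that_are_not_lists(node, path="$"):
    """Walk a loaded rule and report every `patterns` key whose value is not
    a list; this mirrors what semgrep's 'Expected a list for patterns' wants."""
    bad = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "patterns" and not isinstance(value, list):
                bad.append(child)
            bad.extend(patterns_keys_that_are_not_lists(value, child))
    elif isinstance(node, list):
        for i, value in enumerate(node):
            bad.extend(patterns_keys_that_are_not_lists(value, f"{path}[{i}]"))
    return bad


def expand_aliases(text):
    """Return the rule with every alias inlined so no &/* remain; feed this
    copy to tooling that rejects aliases at validation time."""
    doc = load_rule(text)
    return yaml.dump(doc, Dumper=NoAliasDumper, sort_keys=False,
                     default_flow_style=False)


if __name__ == "__main__":
    doc = load_rule(RULE_YAML)
    print("alias resolved to the anchored list:", anchor_and_alias_share_list(doc))
    print("patterns keys that are not lists:", patterns_keys_that_are_not_lists(doc))
    print(expand_aliases(RULE_YAML))
```

Writing the output of `expand_aliases` to a file and running `semgrep --validate` against that copy is the workaround this illustrates: the rule's semantics are unchanged, only the shared list is written out twice.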