Thanks to visit codestin.com
Credit goes to github.com

Skip to content

YAML anchors/aliases in Semgrep rules are (partially?) broken #11654

@mschwager

Description

@mschwager

Describe the bug

YAML anchors/aliases (e.g. & and *) are invalid when using the --validate flag, but appear to correctly detect findings when running on live code (despite logging semgrep-core rule validation failed (RuleParseError)).

To Reproduce

Hard to repro on semgrep.dev, so here's a local example.

Semgrep rule using YAML anchors/aliases (& and *):

$ cat rule.yaml 
rules:
  - id: yaml-anchor-bug
    message: found bad
    languages: [yaml]
    severity: WARNING
    pattern-either:
      - patterns: &anchor
          - pattern: "key: $X"
          - metavariable-regex:
              metavariable: $X
              regex: bad
      - patterns:
          - pattern: |
              data:
                ...
                $KEY: $VALUE
          - metavariable-pattern:
              metavariable: $VALUE
              language: yaml
              patterns: *anchor

Rule test:

$ cat rule.test.yaml 
# ruleid: yaml-anchor-bug
key: bad
---
# ok: yaml-anchor-bug
key: good
---
data:
  config.yaml: |
    # ruleid: yaml-anchor-bug
    key: bad
---
data:
  config.yaml: |
    # ok: yaml-anchor-bug
    key: good

--validate fails:

$ semgrep --validate -c rule.yaml
Configuration is invalid - found 1 configuration error(s), and 1 rule(s).
[ERROR] Rule parse error in rule yaml-anchor-bug:
 Expected a list for patterns

Running the rule produces the correct findings, but includes a validation error message:

$ semgrep -c rule.yaml rule.test.yaml 
...
⠙ Loading rules...semgrep-core rule validation failed (RuleParseError)
⠹ Loading rules...                                                                                       Scanning 1 file (only git-tracked) with 1 Code rule:
...
    rule.test.yaml
    ❯❱ yaml-anchor-bug
          ❰❰ Blocking ❱❱
          found bad
                   
            2┆ key: bad
            ⋮┆----------------------------------------
            7┆ data:
            8┆   config.yaml: |
            9┆     # ruleid: yaml-anchor-bug
           10┆     key: bad

Ran 1 rule on 1 file: 2 findings.

Note the semgrep-core rule validation failed (RuleParseError). It seems like there are different parsers being used for validation and running of the YAML rules.

Expected behavior

I guess the primary desire here is for aliases/anchors to work, which they partially do. However, the secondary desire would be consistency between --validate and running the rules.

Screenshots
If applicable, add screenshots to help explain your problem.

What is the priority of the bug to you?

  • P0: blocking your adoption of Semgrep or workflow
  • P1: important to fix or quite annoying
  • P2: regular bug that should get fixed

Environment

Official binary

Use case
What will fixing this bug enable for you?

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingparsingRequires a fix in a parser, typically a tree-sitter or menhir grammar.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions