fix(py-gov-compliance): resolve lint line numbers via YAML AST#2183

Merged
imran-siddique merged 1 commit into
microsoft:mainfrom
aegis-initiative:fix/py-gov-compliance-lint-ast-line-mapping
May 12, 2026

Conversation

@finnoybu

Contributor

Summary

lint_policy.py's _find_line() walked the raw text looking for a needle substring (old + ":", str(operator), rule_name). That returned a best-effort approximation that broke in predictable ways:

| Scenario | Wrong behaviour |
| --- | --- |
| Condition has both deprecated op: and canonical operator: | Value-based searches (e.g. _find_line(lines, "nope")) hit earlier mentions of the literal needle in other fields, strings, or YAML anchors |
| Two rules each carry a deprecated type: key | _find_line returns the FIRST occurrence, so the second rule's warning points at the first rule's line |
| A comment contains the literal key text (# legacy stub: op: eq) | Substring search matches the comment line before reaching the structural key |

The audit bullet calls this out as "op: collides with operator:". The collision is more general (any value-based substring search collides with strings, comments, or anchors that happen to contain the needle), and the AST-based fix resolves the whole class.
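To make the failure modes concrete, here is a minimal sketch of a first-match substring lookup. The function `find_line_naive` and the sample policy are hypothetical reconstructions for illustration; the removed `_find_line` may have differed in detail.

```python
def find_line_naive(lines, needle):
    """Return the 1-based number of the first line containing needle, else None.

    Hypothetical reconstruction of a raw-text lookup, not the removed code.
    """
    for number, text in enumerate(lines, start=1):
        if needle in text:
            return number
    return None


# Hypothetical policy: a comment mentions "op:" and two rules carry "type:".
POLICY = """\
# legacy stub: op: eq
rules:
  - name: first
    type: allow
  - name: second
    type: deny
"""

lines = POLICY.splitlines()

# The comment on line 1 captures the lookup although it is not a structural key.
comment_hit = find_line_naive(lines, "op:")

# Both rules have a "type:" key, but a first-match search can only ever
# report the first rule's line; the second rule's key sits on line 6.
type_hit = find_line_naive(lines, "type:")
```

Because the search has no notion of which rule or condition it is looking in, no choice of needle can separate the second rule's key from the first.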

Change

Adds a _LineMap helper in lint_policy.py that walks pyyaml's compose() AST once at lint-file entry and caches per-key source lines:

  • top_key_line(key) — top-level keys (e.g. rules:)
  • rule_line(idx) — start of the idx-th rule
  • rule_key_line(idx, key) — keys inside a rule (action:, priority:, deprecated type: / op:)
  • condition_key_line(idx, branch, cond_idx, key) — keys inside conditions (per branch: condition vs conditions[i])

Every line-resolution call site in lint_file and _lint_rules now consults the AST instead of grepping raw text. _find_line is removed.
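A rough sketch of how such a helper could be built on `yaml.compose()` follows. The class name `LineMapSketch`, its internals, and the sample policy are assumptions made for illustration; only the three accessor names come from the list above, and the actual `_LineMap` may be structured differently. Condition-level lookups are omitted to keep the sketch short.

```python
import yaml


class LineMapSketch:
    """Illustrative stand-in for the PR's _LineMap (not the actual implementation)."""

    def __init__(self, text):
        self._top = {}        # top-level key -> 1-based line
        self._rules = []      # 1-based start line of each rule
        self._rule_keys = []  # one {key: line} dict per rule
        root = yaml.compose(text, Loader=yaml.SafeLoader)
        for key, key_node, value_node in self._items(root):
            self._top[key] = self._line(key_node)
            if key == "rules" and isinstance(value_node, yaml.SequenceNode):
                for rule_node in value_node.value:
                    self._rules.append(self._line(rule_node))
                    self._rule_keys.append(
                        {k: self._line(kn) for k, kn, _ in self._items(rule_node)}
                    )

    @staticmethod
    def _line(node):
        # pyyaml marks are 0-based; reported lines are 1-based.
        return node.start_mark.line + 1

    @staticmethod
    def _items(node):
        """Yield (key_text, key_node, value_node) for scalar keys of a mapping."""
        if isinstance(node, yaml.MappingNode):
            for key_node, value_node in node.value:
                if isinstance(key_node, yaml.ScalarNode):
                    yield key_node.value, key_node, value_node

    def top_key_line(self, key):
        return self._top.get(key)

    def rule_line(self, idx):
        return self._rules[idx] if 0 <= idx < len(self._rules) else None

    def rule_key_line(self, idx, key):
        if 0 <= idx < len(self._rule_keys):
            return self._rule_keys[idx].get(key)
        return None


# Hypothetical policy with a misleading comment and two deprecated type: keys.
POLICY = """\
# legacy stub: type: allow
rules:
  - name: first
    type: allow
  - name: second
    type: deny
"""

line_map = LineMapSketch(POLICY)
first_type = line_map.rule_key_line(0, "type")
second_type = line_map.rule_key_line(1, "type")
```

Comments never appear in the composed node tree, so the line-1 comment cannot be matched, and each rule's `type:` key resolves to its own line because the lookup is keyed by rule index rather than by text.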

No new dependency: pyyaml is already required, and its compose() returns nodes carrying start_mark.line (0-based in pyyaml, so the helper adds 1 to report 1-based lines). The audit bullet originally suggested ruamel.yaml; that would have added a heavyweight transitive dep for a YAML feature pyyaml already supports. The choice is documented in the _LineMap docstring.
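As a small illustration of the pyyaml feature in question, `yaml.compose()` returns a node tree in which each node carries a `start_mark` with its source position. The two-key document below is a made-up example, not a real policy file.

```python
import yaml

# Made-up two-line document, used only to show the position marks.
text = "version: 1\nrules: []\n"

# compose() stops at the node graph instead of building Python objects,
# so each key node still knows where it came from in the source.
root = yaml.compose(text, Loader=yaml.SafeLoader)

# start_mark.line counts from zero, hence the +1 for human-facing output.
key_lines = {
    key_node.value: key_node.start_mark.line + 1
    for key_node, _value_node in root.value
}
```

By contrast, `yaml.safe_load()` on the same text yields a plain dict with no position information, which is why a node-level API is needed for line reporting.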

Tests

tests/test_lint_policy.py gains a TestAstLineResolution class with seven cases:

| Test | Pins |
| --- | --- |
| test_deprecated_op_in_condition_points_at_op_not_operator | op: on line 8 reported, not the adjacent operator: line |
| test_unknown_operator_points_at_operator_value_line | "nope" used as a value earlier doesn't poison the operator: nope lookup |
| test_deprecated_type_in_rule_targets_correct_rule | Two type: keys produce two distinct line numbers, not two copies of the first |
| test_unknown_action_points_at_action_key | action: zap line is reported |
| test_invalid_priority_targets_priority_key | priority: key line, not a value-side match |
| test_empty_rules_warning_points_at_rules_key | rules: key line |
| test_substring_collision_in_comment_ignored | A # legacy stub: op: eq comment doesn't capture the lookup |
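The comment-collision case can be sketched as a standalone check. Everything here is an assumption for illustration: the helper `condition_key_line` is a simplified free function (the PR's method also takes a branch and a condition index), the `field: role` entry is invented, and the real tests exercise lint_file rather than a helper like this.

```python
import yaml


def mapping_items(node):
    """Yield (key_text, key_node, value_node) for scalar keys of a mapping node."""
    if isinstance(node, yaml.MappingNode):
        for key_node, value_node in node.value:
            if isinstance(key_node, yaml.ScalarNode):
                yield key_node.value, key_node, value_node


def condition_key_line(text, rule_idx, key):
    """Return the 1-based line of `key` inside rules[rule_idx].condition, else None.

    Simplified, hypothetical variant covering only the single-condition branch.
    """
    root = yaml.compose(text, Loader=yaml.SafeLoader)
    rules = next((v for k, _, v in mapping_items(root) if k == "rules"), None)
    if not isinstance(rules, yaml.SequenceNode) or not 0 <= rule_idx < len(rules.value):
        return None
    rule = rules.value[rule_idx]
    condition = next((v for k, _, v in mapping_items(rule) if k == "condition"), None)
    for name, key_node, _ in mapping_items(condition):
        if name == key:
            return key_node.start_mark.line + 1
    return None


# Hypothetical policy: the comment on line 3 contains the literal text "op:".
POLICY = """\
rules:
  - name: r1
    # legacy stub: op: eq
    condition:
      field: role
      op: eq
"""

# First-match substring search, for comparison with the AST lookup.
naive_line = next(
    number
    for number, line in enumerate(POLICY.splitlines(), start=1)
    if "op:" in line
)
ast_line = condition_key_line(POLICY, 0, "op")
```

The substring search lands on the comment, while the AST lookup reports the structural key because the composed tree contains no comment nodes.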

All 37 pre-existing test_lint_policy.py cases continue to pass.

$ PYTHONPATH=src python -m pytest tests/test_lint_policy.py -q
44 passed in 0.68s

Full agent-compliance suite: 476 passed, 1 pre-existing failure in test_red_team_cli unrelated to this change.

Test plan

  • CI passes
  • All 44 test_lint_policy.py cases pass
  • No regression in existing TestLintFileDeprecatedFields / TestLintFileInvalidAction / TestLintFileInvalidPriority cases
  • CLI output (agent-compliance lint-policy) still surfaces <file>:<line>: headers in the human format

Surfaced during independent audit conducted by @finnoybu (Ken Tannenbaum, AEGIS Initiative); [LOW, Python Governance].

lint_policy.py's _find_line() walked the raw text looking for a needle
substring. That returned a best-effort approximation that broke in
predictable ways:

  * Deprecated `op:` inside a condition that also has `operator:` —
    even though `op:` isn't a literal substring of `operator:`, value
    searches (`_find_line(lines, "nope")` for an unknown operator
    value) would hit any earlier mention of "nope" in any field or
    string.
  * Duplicate deprecated keys across multiple rules — substring
    search returns the FIRST occurrence, so a second rule's `type:`
    warning pointed at the first rule's line.
  * Comments containing the literal key text (e.g.
    `# legacy stub: op: eq`) attracted the line lookup before the
    structural key two lines below.

Adds a _LineMap that walks pyyaml's compose() AST once and caches
per-key source lines (`top_key_line`, `rule_line`, `rule_key_line`,
`condition_key_line`). All line-resolution call sites in lint_file /
_lint_rules now consult the AST rather than grepping raw text. No new
dependency — pyyaml is already required.

Verified: PYTHONPATH=src python -m pytest tests/test_lint_policy.py
-q -> 44 passed (37 pre-existing + 7 new TestAstLineResolution cases).
Full agent-compliance suite: 476 passed, 1 pre-existing failure in
test_red_team_cli unrelated to this change.
@github-actions github-actions Bot added the tests label May 12, 2026
@github-actions

🤖 AI Agent: security-scanner — View details

No security issues found.

@github-actions

🤖 AI Agent: docs-sync-checker — Docs Sync

  • _LineMap in lint_policy.py -- missing docstring
  • README.md -- section on linting policies needs update
  • CHANGELOG.md -- missing entry for behavioral changes in line resolution for deprecated fields and rules

@github-actions github-actions Bot added the size/L Large PR (< 500 lines) label May 12, 2026
@github-actions

🤖 AI Agent: breaking-change-detector — API Compatibility

| Severity | Change | Impact |
| --- | --- | --- |
| Breaking | Removed _find_line() function | Any existing code relying on this function for line number resolution will break. |
| Breaking | Changed line resolution mechanism from substring search to AST-based lookup | Changes in line numbers reported for deprecated fields and errors may affect users relying on specific line numbers for debugging or reporting. |

@github-actions

🤖 AI Agent: code-reviewer — View details

TL;DR: 0 blockers, 0 warnings. No issues found. Clean change.

@github-actions

🤖 AI Agent: test-generator — `lint_policy.py`

  • test_deprecated_op_in_condition_points_at_op_not_operator -- Validates that the deprecated op: key points to its correct line, not operator:.
  • test_unknown_operator_points_at_operator_value_line -- Ensures that an unknown operator points to the correct line, avoiding false positives from earlier values.
  • test_deprecated_type_in_rule_targets_correct_rule -- Confirms that each type: key in rules returns its distinct line number.
  • test_unknown_action_points_at_action_key -- Checks that an unknown action correctly reports the line of the action: key.
  • test_invalid_priority_targets_priority_key -- Validates that an invalid priority reports the line of the priority: key.
  • test_empty_rules_warning_points_at_rules_key -- Ensures that a warning for empty rules points to the rules: key line.
  • test_substring_collision_in_comment_ignored -- Validates that comments do not interfere with locating the correct line for the op: key.

@github-actions


🟡 Contributor Check: MEDIUM

| Check | Result |
| --- | --- |
| Profile | MEDIUM |
| Credential | NONE |
| Overall | MEDIUM |

Automated check by AGT Contributor Check.

@github-actions github-actions Bot added the needs-review:MEDIUM Contributor check flagged MEDIUM risk label May 12, 2026
@github-actions


PR Review Summary

| Check | Status | Details |
| --- | --- | --- |
| 🔍 Code Review | ✅ Passed | No issues found |
| 🛡️ Security Scan | ✅ Passed | No issues found |
| 🔄 Breaking Changes | ✅ Completed | Analysis complete |
| 📝 Docs Sync | ✅ Completed | Analysis complete |
| 🧪 Test Coverage | ✅ Completed | Analysis complete |

Verdict: ✅ Ready for human review

@imran-siddique imran-siddique merged commit 4536d2b into microsoft:main May 12, 2026
13 of 14 checks passed
MohammadHaroonAbuomar pushed a commit to MohammadHaroonAbuomar/agt-acs that referenced this pull request Jun 1, 2026
…soft#2183)
