ci(mythos-auto): scope each Tier-5 scan to the PR diff, not the whole file #181

Merged: avrabe merged 1 commit into main from fix/mythos-auto-diff-scan on May 24, 2026

@avrabe commented May 24, 2026

Summary

Move mythos-auto from whole-file scanning to PR-diff-scoped scanning. Eliminates the v0.9–v0.10 "whole-file treadmill" where every parser.rs / fact.rs / resolver.rs PR re-triggered every latent canonical-ABI bug in the touched file — across #178 and #179 the auto-runner surfaced 4+ findings in successive parser.rs re-scans, one of which (the claimed inversion of LS-P-8 against canonical-abi.py::record_size) was a flat false positive.

What changes

  1. The scan job's actions/checkout step now uses fetch-depth: 0, because both base.sha and head.sha must be reachable to compute the diff.
  2. A new "Extract PR diff for ${matrix.file}" step writes git diff --no-color BASE...HEAD -- $F to mythos-diffs/$SLUG.diff. The triple-dot form diffs from the merge-base, so commits that landed on the base branch after the PR was opened do not appear.
  3. The claude-code-action prompt now references the diff via steps.diff.outputs.diff_path / diff_size and instructs the AI to report only findings introduced by the diff. Pre-existing bugs in unchanged regions are explicitly out of scope; full-file context remains readable for caller/callee understanding.

An empty diff (rename / mode / pure delete) yields NO_FINDINGS by construction — no skip logic needed.
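
As a rough illustration of step 2, the per-file extraction could look like the sketch below. It is not the workflow's actual step (which lives in .github/workflows/mythos-auto.yml and is not shown here): `slug_for` is a hypothetical slug rule, `src/parser.rs` is a placeholder path, and the returned pair only mirrors the `diff_path` / `diff_size` outputs named above.

```python
import subprocess
from pathlib import Path


def diff_command(base_sha: str, head_sha: str, path: str) -> list[str]:
    """Build the argv for the per-file diff; BASE...HEAD diffs from the merge-base."""
    return ["git", "diff", "--no-color", f"{base_sha}...{head_sha}", "--", path]


def slug_for(path: str) -> str:
    """Hypothetical slug rule (an assumption): turn separators and dots into dashes."""
    return path.replace("/", "-").replace(".", "-")


def extract_diff(base_sha: str, head_sha: str, path: str,
                 out_dir: str = "mythos-diffs") -> tuple[Path, int]:
    """Write the diff to <out_dir>/<slug>.diff and return (diff_path, size in bytes)."""
    # Requires a checkout in which both SHAs are reachable (fetch-depth: 0).
    result = subprocess.run(diff_command(base_sha, head_sha, path),
                            check=True, capture_output=True)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    diff_path = out / f"{slug_for(path)}.diff"
    diff_path.write_bytes(result.stdout)
    return diff_path, len(result.stdout)
```

Keeping the command construction separate from the subprocess call makes the merge-base form (`BASE...HEAD`) easy to check without a repository.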

Why now

After v0.10.0's 15-fix canonical-ABI sweep, latent bugs in parser.rs are likely exhausted, but the gate's design caused real grief: PRs were blocked on every latent bug in any Tier-5 file they touched, even when their own changes were elsewhere. The LS-N verification gate continues to pin every approved loss scenario, so latent-bug regressions remain caught at PR time — they're just no longer other PRs' problem.

Test plan

  • CI on this PR exercises the new Extract PR diff step (this PR touches .github/workflows/mythos-auto.yml, which is NOT a Tier-5 path, so the scan job won't actually fire — confirms the detect-job filter still excludes workflow files cleanly).
  • Next Tier-5 PR (e.g. the upcoming LS-P-12 structural fix) exercises the diff-scoped scan end-to-end.

🤖 Generated with Claude Code

… file

The whole-file scan that landed with v0.9.0 (#162, #164, #170, #173,
#175) caused a treadmill across v0.10.0's #178 and #179: every
parser.rs / fact.rs / resolver.rs PR re-triggered every latent
canonical-ABI bug in the touched file, regardless of whether the PR
went near that code. PR #179 surfaced 4+ findings in successive
re-scans of parser.rs alone; each fix exposed the next, and one
finding (the auto-runner's claimed inversion of LS-P-8 against
canonical-abi.py::record_size) was an outright false positive.

This commit moves the scan to a diff-scoped model:

  1. The scan job's actions/checkout step now uses fetch-depth: 0
     so both base.sha and head.sha are reachable.

  2. A new "Extract PR diff for ${matrix.file}" step writes
     `git diff --no-color BASE...HEAD -- $F` to a workspace file
     under mythos-diffs/. Triple-dot diffs from the merge-base, so
     commits that landed on the base branch after the PR was opened
     do not show up.

  3. The discover prompt now references the diff file by path
     (diff_path / diff_size step outputs) and tells the AI to
     report only findings *introduced* by the diff. Pre-existing
     bugs in unchanged regions are explicitly out of scope —
     they can be filed against main in their own dedicated PR.
     Full-file context remains readable for caller/callee
     understanding.

An empty diff (rename / mode / pure delete) is allowed — the AI
sees no introduced changes and reports NO_FINDINGS by construction;
no skip logic required at the workflow level.
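
To make "introduced by the diff" concrete: the commit leaves that judgment to the AI prompt, but a mechanical approximation would be to collect the new-file line numbers of added lines from the unified diff and keep only findings that land on them. The sketch below is illustrative only and assumes a single-file diff in standard unified format; `added_lines` is a hypothetical helper, not part of the workflow.

```python
import re

# Matches "@@ -old_start[,old_count] +new_start[,new_count] @@"
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_lines(diff_text: str) -> set[int]:
    """Return new-file line numbers of '+' lines in a single-file unified diff."""
    added: set[int] = set()
    new_line = None  # stays None until the first hunk header is seen
    for line in diff_text.splitlines():
        match = HUNK_HEADER.match(line)
        if match:
            new_line = int(match.group(1))
            continue
        if new_line is None:
            continue  # file headers (diff --git, ---, +++) precede the first hunk
        if line.startswith("+"):
            added.add(new_line)
            new_line += 1
        elif line.startswith("-") or line.startswith("\\"):
            continue  # removed lines and "\ No newline" markers are not in the new file
        else:
            new_line += 1  # context line
    return added
```

An empty diff produces an empty set, so no finding can be in scope, which matches the NO_FINDINGS-by-construction behaviour described above.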

Unblocks future Tier-5 PRs from being judged on bugs they did not
introduce. Latent bugs in the unchanged file body remain the
project's problem to fix proactively (the LS-N gate continues to
pin every approved scenario), but they no longer block unrelated
PRs from merging.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
@github-actions

LS-N verification gate

33/33 approved LS entries verified

Status                               Count
Passed (≥1 test, all green)          33
Failed (≥1 test failure)             0
Missing (no ls_*_NN_* test found)    0

Approved loss-scenarios.yaml entries are expected to have a
regression test named ls_<letter>_<num>_* (e.g. LS-A-11 →
ls_a_11_*). The gate runs each prefix via cargo test --lib --no-fail-fast and aggregates pass/fail/missing.
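
The naming convention and the three buckets in the table can be sketched as follows. This is an assumption-laden illustration, not the code in tools/post_verification_comment.py: `ls_test_prefix` and `classify` are hypothetical names, and the sketch applies no zero-padding to the number because the comment above does not say whether the real gate does.

```python
import re

LS_ID = re.compile(r"^LS-([A-Z]+)-(\d+)$")


def ls_test_prefix(ls_id: str) -> str:
    """Map an LS entry id such as 'LS-A-11' to its test-name prefix 'ls_a_11_'."""
    match = LS_ID.match(ls_id)
    if match is None:
        raise ValueError(f"not an LS entry id: {ls_id!r}")
    letter, num = match.groups()
    return f"ls_{letter.lower()}_{num}_"


def classify(passed: int, failed: int) -> str:
    """Bucket one LS entry from its test counts, mirroring the table's rows."""
    if failed > 0:
        return "failed"   # at least one test failure
    if passed > 0:
        return "passed"   # at least one test, all green
    return "missing"      # no matching test found
```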

Failed LS entries

(none)

Missing regression tests

(none)

Updated automatically by tools/post_verification_comment.py.
Source of truth: safety/stpa/loss-scenarios.yaml.

@avrabe avrabe merged commit 85527b1 into main May 24, 2026
16 of 18 checks passed
@avrabe avrabe deleted the fix/mythos-auto-diff-scan branch May 24, 2026 15:22
avrabe added a commit that referenced this pull request May 24, 2026
Spec-compliant UTF transcoding + working list<conditional-pointer>
adapters + a process fix that closes the v0.9–v0.10 whole-file-scan
treadmill. 3 PRs since v0.10.0:

## Headline

- **#181 — mythos-auto diff-scoped scanning.** Each Tier-5 PR is now
  judged on the diff it introduces, not on every latent bug in the
  whole file. The scan step computes
  `git diff --no-color BASE...HEAD -- $F` per touched file, writes
  it to `mythos-diffs/`, and instructs the AI to report only
  findings *introduced* by the diff. v0.10's whole-file scan
  surfaced 4+ pre-existing findings on every parser.rs PR plus a
  reproducible false positive (the auto-runner's claimed inversion
  of LS-P-8 against `canonical-abi.py::record_size`); diff-scoped
  scanning eliminates that whole class of churn while preserving
  the LS-N gate's coverage of approved scenarios.

- **#182 — UTF transcoders emit U+FFFD on malformed input.** v0.10
  closed two cross-memory leaks by trapping (`unreachable`):
  LS-P-16 (UTF-16→UTF-8 lone high surrogate at end-of-input would
  have read 2 bytes past the buffer) and LS-P-19 (UTF-8→UTF-16
  truncated multi-byte lead would have read 1–3 bytes past). The
  trap was conservative but spec-incorrect; the Canonical ABI
  mandates lossy replacement. All four sites (LS-P-16 ×1,
  LS-P-19 ×3 for the 2/3/4-byte branches) now substitute U+FFFD
  (3-byte UTF-8 `EF BF BD` / single UTF-16 BMP code unit) and
  consume only the lead, falling through to the existing
  encoder. Fused adapters produce spec-compliant output for
  malformed UTF instead of refusing to run.

- **#183 — LS-P-12 + LS-P-18 structural per-element conditional
  pointer fixup.** v0.10 panicked at adapter generation when an
  outer list's element type contained a pointer-bearing
  option/result/variant payload (e.g. `list<option<string>>`,
  `list<result<string, E>>`, `list<variant{ …(string)… }>`, plus
  the LS-P-18 mixed-record bypass). v0.11 ships the proper fix:
  `CopyLayout::Elements.inner_pointers` becomes `Vec<InnerPointer>`
  carrying a `guards: Vec<DiscriminantGuard>` chain per
  descriptor; `element_inner_pointers` recurses into
  Option/Result/Variant payloads threading the enclosing
  discriminant onto the chain; the FACT adapter's per-element
  fixup AND-evaluates each guard at the per-element base + offset
  before firing realloc + memory.copy + ptr-rewrite, wrapping the
  body in an `If` that fires only when every enclosing arm
  holds. Common WIT shapes (`list<option<string>>` for nullable
  string lists, etc.) now generate correct cross-component
  adapters.
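
The U+FFFD substitution described for #182 can be illustrated outside the adapter. The sketch below is a plain Python model of the UTF-16→UTF-8 direction, not the generated Wasm: it checks bounds before pairing a high surrogate, emits `EF BF BD` for a lone one, and consumes only the lead. Treating an unpaired low surrogate the same way is an assumption of this sketch; the release note only mentions the lone-high-surrogate site.

```python
REPLACEMENT = "\ufffd".encode("utf-8")  # the 3-byte sequence EF BF BD


def utf16_units_to_utf8_lossy(units: list[int]) -> bytes:
    """Transcode UTF-16 code units to UTF-8, replacing unpaired surrogates with U+FFFD."""
    out = bytearray()
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # High surrogate: pair it only if a low surrogate follows in bounds.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                code_point = 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
                out += chr(code_point).encode("utf-8")
                i += 2
                continue
            out += REPLACEMENT  # lone high surrogate: consume only the lead
        elif 0xDC00 <= unit <= 0xDFFF:
            out += REPLACEMENT  # unpaired low surrogate (assumed handling)
        else:
            out += chr(unit).encode("utf-8")
        i += 1
    return bytes(out)
```

The bounds check before reading `units[i + 1]` is the part that corresponds to the out-of-buffer read the earlier trap guarded against.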

## Test coverage

- 266 lib tests
- LS-N verification gate: 33/33 approved scenarios
- mythos-auto delta-pass: clean diff-scoped per-file scans on
  parser.rs / fact.rs / resolver.rs
- LS-P-12 / LS-P-18 regressions upgraded from `#[should_panic]`
  to structural assertions that pin the guarded `InnerPointer`
  shape (option-Some / Result-Ok / mixed-record both-pointers
  cases).
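
The guarded `InnerPointer` shape that those assertions pin can be modelled roughly as below. This is a Python stand-in for the Rust types named in #183: the field names, the one-byte discriminant, and the element layout (discriminant at offset 0, pointer pair at offset 4) are all assumptions made for illustration, not the project's definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscriminantGuard:
    offset: int    # assumed: byte offset of the discriminant within the element
    expected: int  # assumed: discriminant value selecting the pointer-bearing arm


@dataclass(frozen=True)
class InnerPointer:
    offset: int                                 # byte offset of the pointer within the element
    guards: tuple[DiscriminantGuard, ...] = ()  # enclosing discriminants, outermost first


def should_fix_up(element: bytes, pointer: InnerPointer) -> bool:
    """AND-evaluate every guard; a pointer with no guards is always fixed up."""
    return all(element[g.offset] == g.expected for g in pointer.guards)
```

For a `list<option<string>>` element under these assumptions, the string pointer carries a single guard (discriminant equal to 1 for Some), so a None element is skipped and only Some elements reach the realloc + copy + pointer-rewrite step.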

Co-authored-by: Claude Opus 4.7 <[email protected]>