verify-plan skill for Claude Code

Post-implementation cross-check skill for Claude Code /plan. Verifies that a /plan was fully implemented by comparing plan items against actual code changes. Catches phantom completions (tasks marked complete that were never done), dead code, and missing tests, then interactively walks you through gaps and offers to fix them.

Note

Works on committed and uncommitted work. Run before or after committing.
Complements code review but doesn't replace it. Code review checks whether the code that was written is correct. This skill checks whether the code that was supposed to be written actually exists. See Phantom Completions in AI-Assisted Development for the theory behind why this gap occurs.
Run this skill before /simplify, /refactor, or similar commands. If there's a gap between the plan and the current code, those commands may remove or restructure code that appears unused or dead only because its wiring is missing. Close the phantom completions and implementation gaps first, then simplify.

Installation

Clone into the global skills folder

mkdir -p ~/.claude/skills
git clone https://github.com/datastone-inc/verify-plan-skill ~/.claude/skills/verify-plan

This installs the skill globally, making it available in all your projects. Most users need nothing more.

Project-scoped install (niche): If you need the skill scoped to a single repo, avoid cloning into it, as that creates a nested git repo which Git handles poorly. Copy from the global install instead:

mkdir -p .claude/skills
cp -r ~/.claude/skills/verify-plan .claude/skills/verify-plan

Note: the copy won't receive updates via git pull. You'll need to re-copy after updating the global install.

Restart Claude Code or start a new session.

Keeping it up-to-date

cd ~/.claude/skills/verify-plan && git pull

Requirements

Python 3.10+
Git

No pip dependencies. Uses only the Python standard library.

Usage

In Claude Code:

# Auto-discover most recent plan, review all changes (committed + uncommitted) since it was written
/verify-plan

# Specify a plan file
/verify-plan .claude/plans/my-plan.md

# Review only uncommitted work
/verify-plan --scope uncommitted

# Review committed changes only vs a branch
/verify-plan --scope branch --base develop

# Review everything (committed + uncommitted) vs main
/verify-plan --scope all

# List available plans
/verify-plan --list

# Or just say it naturally
> was the plan actually followed?
> check if the plan was implemented
> I haven't committed yet, did I cover the plan?

# You can also run the skill with ad-hoc AI augmentation, e.g.:
> /verify-plan list all the plans in .claude/plans in reverse chronological order with a brief summary title. run the verify-plan skill on the ones I select in parallel.

Scopes

| Scope | What it diffs | When to use |
|---|---|---|
| `plan` (default) | Changes since the plan was last modified, including uncommitted | Best general-purpose option |
| `branch` | Committed changes only: `base..HEAD` | Clean committed-only view |
| `uncommitted` | Staged + unstaged vs HEAD | Just finished implementing, haven't committed |
| `all` | Committed + uncommitted vs base branch | Complete picture |
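
To make the scopes concrete, here is a minimal sketch of how each one might translate into a `git diff` command line. The function name `diff_args` and the `anchor_commit` parameter are hypothetical, and the skill's own scripts may build their commands differently. In particular, how the `plan` scope locates its starting point is an assumption here.

```python
from typing import List, Optional


def diff_args(scope: str, base: str = "main",
              anchor_commit: Optional[str] = None) -> List[str]:
    """Return a git command line approximating each scope (illustrative only)."""
    if scope == "branch":
        # Committed changes only: base..HEAD
        return ["git", "diff", f"{base}..HEAD"]
    if scope == "uncommitted":
        # Staged + unstaged changes relative to HEAD
        return ["git", "diff", "HEAD"]
    if scope == "all":
        # Committed + uncommitted: working tree compared with the base ref
        return ["git", "diff", base]
    if scope == "plan":
        # Assumption: the caller has already resolved the last commit made
        # before the plan file was modified (hypothetical anchor_commit).
        if anchor_commit is None:
            raise ValueError("plan scope needs an anchor commit")
        return ["git", "diff", anchor_commit]
    raise ValueError(f"unknown scope: {scope}")


if __name__ == "__main__":
    for name in ("branch", "uncommitted", "all"):
        print(name, "->", " ".join(diff_args(name, base="develop")))
```

The difference between `branch` and `all` comes down to the second endpoint of the diff: `base..HEAD` stops at the last commit, while diffing against `base` alone compares the working tree and so includes uncommitted edits.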

CLI Reference

Direct script usage (for automation or debugging):

usage: review.py [-h] [--base BASE] [--scope {branch,plan,uncommitted,all}]
                 [--repo REPO] [--output OUTPUT] [--json] [--list]
                 [plan_file]

Audit whether a Claude Code /plan was fully implemented.

positional arguments:
  plan_file             Path to plan markdown file. If omitted, discovers the
                        most recent plan from CC settings or default
                        locations.

options:
  -h, --help            show this help message and exit
  --base BASE           Git ref to diff against (default: main)
  --scope {branch,plan,uncommitted,all}
                        What changes to review: branch=committed vs base
                        branch, plan=changes since plan was created/updated
                        (default), uncommitted=only staged+unstaged vs HEAD,
                        all=committed+uncommitted vs base branch
  --repo REPO           Repository root (default: current directory)
  --output OUTPUT       Output file path (default: PLAN_REVIEW.md in repo
                        root)
  --json                Output raw JSON results instead of markdown
  --list                List available plans and exit

Examples:

# Run directly from command line
python3 scripts/review.py examples/sample-plan.md --repo .

# Review uncommitted work only
python3 scripts/review.py --scope uncommitted

# Compare against develop branch instead of main
python3 scripts/review.py --base develop

# Output JSON for further processing
python3 scripts/review.py --json > results.json

# List all available plans
python3 scripts/review.py --list

Interactive flow

Summary: scorecard and per-Change overview
Walkthrough: one Change group at a time, explaining gaps, assessing severity, flagging false positives
Fix: implement missing pieces in place when asked
Re-verify: re-run the audit after fixes to confirm

Claude pauses at each step for your input. You choose what to fix, skip, or stop.

Language support

Pattern extraction works across multiple languages, detected automatically from file extensions and code fences:

TypeScript/JS, Python, Rust, Go, Java/Kotlin, C/C++, C#, Ruby, Swift, SQL

To add a new language, add an entry to scripts/languages.py. No other code changes needed. See the existing entries for the pattern format.

How it works

Parse: reads the plan markdown and extracts verifiable items (types, functions, fields, filters, tests) using language-aware pattern extraction
Evidence: gathers changes according to the chosen scope (branch diff, plan-anchored, uncommitted, or all)
Cross-reference: checks each item against the diff; language-aware dead-code detection finds declared-but-unused symbols. This cross-reference step is what catches phantom completions, where the agent marked a plan item complete without performing the specified work.
Interactive: Claude presents findings, walks through gaps, and offers to fix them
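
The dead-code idea in the cross-reference step can be illustrated with a small sketch: a function counts as unused when its name is defined but no call to it appears in any file. The helper `unused_functions` and its Python-style call regex are hypothetical stand-ins, not the patterns the skill ships with.

```python
import re
from typing import Dict, List


def unused_functions(names: List[str], files: Dict[str, str]) -> List[str]:
    """Return the names that are never called in the given file contents."""
    unused = []
    for name in names:
        # Hypothetical Python-style call pattern: the name followed by "(",
        # excluding the "def name(" definition itself. Simplification: a
        # definition written with unusual spacing would slip past the check.
        call = re.compile(r"(?<!def )\b" + re.escape(name) + r"\s*\(")
        if not any(call.search(text) for text in files.values()):
            unused.append(name)
    return unused


if __name__ == "__main__":
    sources = {
        "a.py": "def used():\n    return 1\n\ndef orphan():\n    return 2\n",
        "b.py": "from a import used\nprint(used())\n",
    }
    print(unused_functions(["used", "orphan"], sources))
```

A symbol flagged this way is a signal rather than a verdict: it may be dead, or it may be a phantom completion whose wiring was never added.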

Architecture

The skill operates as a 4-stage pipeline. Each stage is independent and testable:

┌──────────────┐    ┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│   Parse      │ -> │   Evidence   │ -> │     Cross    │ -> │ Interactive  │
│   Plan       │    │   Gather     │    │  Reference   │    │   Review     │
└──────────────┘    └──────────────┘    └──────────────┘    └──────────────┘

Stage 1: Parse Plan (`scripts/parse_plan.py`)

Reads a Claude Code /plan markdown file and extracts verifiable items:

Recognizes ## Change N: headings to group related work
Extracts code blocks and applies language-specific regex patterns from scripts/languages.py
Finds inline code mentions in prose (e.g., "Update the handleRequest function")
Categorizes items: type_definition, function, field, test, filter_logic, wiring
Outputs structured JSON: {id, change_id, change_title, file_pattern, expected_patterns, category}
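
As a rough illustration of the grouping step, the sketch below collects fenced code blocks under their `## Change N:` heading. It is a simplified stand-in, not the logic of `scripts/parse_plan.py`: the function name `split_changes` and the `code_blocks` key are hypothetical, and only `change_id` and `change_title` mirror the output fields listed above.

```python
import re
from typing import Dict, List, Optional

FENCE_MARK = "`" * 3  # built this way so the sketch can sit inside a fenced block
CHANGE_HEADING = re.compile(r"^##\s+Change\s+(\d+):\s*(.*)$")
FENCE_OPEN = re.compile(r"^" + re.escape(FENCE_MARK) + r"(\w*)\s*$")


def split_changes(plan_text: str) -> List[Dict]:
    """Group fenced code blocks under their '## Change N:' heading."""
    changes: List[Dict] = []
    current: Optional[Dict] = None  # the Change group being filled
    block: Optional[Dict] = None    # the code block being collected, if any

    for line in plan_text.splitlines():
        if block is not None:
            if line.strip() == FENCE_MARK:
                # Closing fence: finish the block and attach it to its Change
                block["code"] = "\n".join(block.pop("lines"))
                if current is not None:
                    current["code_blocks"].append(block)
                block = None
            else:
                block["lines"].append(line)
            continue
        opening = FENCE_OPEN.match(line)
        if opening:
            block = {"lang": opening.group(1), "lines": []}
            continue
        heading = CHANGE_HEADING.match(line)
        if heading:
            current = {
                "change_id": int(heading.group(1)),
                "change_title": heading.group(2).strip(),
                "code_blocks": [],
            }
            changes.append(current)
    return changes
```

Each collected block keeps its fence language, which is the hook a language-specific pattern table would need in order to choose the right regexes.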

Stage 2: Evidence Gather (`scripts/gather_evidence.py`)

Collects git diff and file contents according to scope:

Scopes:
- plan (default): Changes since plan file was last modified, including uncommitted
- branch: Committed changes only (base..HEAD)
- uncommitted: Staged + unstaged vs HEAD
- all: Committed + uncommitted vs base

Parses unified diffs by file
Reads current file contents for pattern searching
Handles exact path and basename matching
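
Parsing a unified diff by file can be sketched as follows. This is a minimal illustration, not the code in `scripts/gather_evidence.py`, and the function name `added_lines_by_file` is hypothetical. It reads the line counts from each hunk header so that added lines whose content happens to begin with `++` are not mistaken for file headers.

```python
import re
from typing import Dict, List, Optional

# Hunk header such as "@@ -1,2 +1,4 @@"; an omitted count means 1
HUNK = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def added_lines_by_file(diff_text: str) -> Dict[str, List[str]]:
    """Map each file path in a unified diff to the lines the diff adds."""
    added: Dict[str, List[str]] = {}
    current: Optional[str] = None
    old_left = new_left = 0  # lines still expected in the current hunk

    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            if line.startswith("+"):
                new_left -= 1
                if current is not None:
                    added.setdefault(current, []).append(line[1:])
            elif line.startswith("-"):
                old_left -= 1
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:
                # Context line: present in both old and new versions
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("+++ "):
            path = line[4:].strip()
            if path == "/dev/null":
                current = None  # file was deleted
            else:
                # git prefixes the new path with "b/"
                current = path[2:] if path.startswith("b/") else path
            continue
        hunk = HUNK.match(line)
        if hunk:
            old_left = int(hunk.group(1) or 1)
            new_left = int(hunk.group(2) or 1)
    return added
```

Keeping only added lines per file is what lets a later step distinguish a pattern that was introduced by the change from one that already existed in the file.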

Stage 3: Cross-Reference (`scripts/cross_reference.py`)

Matches plan patterns against diff evidence:

Evidence levels:
- ✅ IN_DIFF: All patterns found in added diff lines
- 🔍 MIXED: Some in diff, others pre-existing
- ⚠️ PRE_EXISTING: Pattern exists but not in diff
- ❌ NOT_FOUND: Not found anywhere
- ⏭️ SKIPPED: No mechanically verifiable patterns

Dead-code detection:
- Type definitions: searches for references elsewhere
- Functions: searches for calls using language-specific call_pattern
- Fields: searches for assignments and reads using access_pattern

Generates markdown report with evidence table and dead-code signals
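
One way the evidence levels could be assigned is sketched below. The function `classify` is hypothetical, and its handling of partial matches is an assumption: the level definitions above do not say what happens when some patterns are found and others are missing entirely, so this sketch reports MIXED when at least one pattern is in the diff and PRE_EXISTING otherwise.

```python
import re
from typing import Iterable, List


def classify(patterns: List[str], added_lines: Iterable[str],
             file_text: str) -> str:
    """Assign an evidence level to one plan item (illustrative logic)."""
    if not patterns:
        return "SKIPPED"  # nothing mechanically verifiable
    added_text = "\n".join(added_lines)
    in_diff = [bool(re.search(p, added_text)) for p in patterns]
    anywhere = [hit or bool(re.search(p, file_text))
                for p, hit in zip(patterns, in_diff)]
    if all(in_diff):
        return "IN_DIFF"
    if not any(anywhere):
        return "NOT_FOUND"
    if any(in_diff):
        return "MIXED"
    return "PRE_EXISTING"


if __name__ == "__main__":
    pats = [r"\bhandle_request\b", r"\bRequestLog\b"]
    current_file = "class RequestLog:\n    pass\n\ndef handle_request():\n    pass\n"
    print(classify(pats, ["def handle_request():"], current_file))
```

PRE_EXISTING is the level that most often points at a phantom completion: the symbol the plan names is present, but nothing in the reviewed changes touched it.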

Stage 4: Interactive Review (orchestrated by Claude Code)

Claude Code reads the generated report and walks through findings with you:

Presents summary scorecard and per-Change overview
Reviews gaps one Change at a time, explaining evidence and severity
Offers to implement missing pieces when you confirm
Re-runs the audit after fixes to verify completion

The scripts provide evidence; Claude Code makes the verdicts and suggests fixes.

Contributing

Contributions welcome! See CONTRIBUTING.md for detailed guidelines. In particular, you may want to focus on adding language support or improving the cross-reference logic.

For bug reports and feature requests, please open an issue.

Authors

Dave Sharpe at dataStone Inc.: concept, design, and development

Claude (Anthropic): co-developed the implementation, scripts, and multi-language support via Claude Code

License

MIT
