feat(testing): close the testing gap in ce:work, ce:plan, and testing-reviewer#438
Conversation
…-reviewer Make "no tests" a deliberate decision rather than an accidental omission across three layers: - ce:plan: blank test scenarios on feature-bearing units are now flagged as incomplete; units that genuinely need no tests use an explicit annotation - ce:work/ce:work-beta: execution loop includes per-task testing deliberation; quality checklist updated from "Tests pass" to "Testing addressed" - testing-reviewer: new check flags behavioral changes with zero test additions Ships with contract tests verifying each change (practice what we preach). Sync decision: Propagated to beta — shared testing deliberation guidance. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 13fe5e02d2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Feature-bearing units must have actual test scenarios — the `Test expectation: none` annotation is only valid for non-feature-bearing units (config, scaffolding, styling). Previous wording allowed feature-bearing units to use the annotation as an escape hatch. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
|
@codex review |
|
Codex Review: Didn't find any major issues. 🎉 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Summary
Makes "no tests" a deliberate decision rather than an accidental omission. Targets three layers with focused edits:
Test expectation: none -- [reason]annotation.Ships with 5 contract tests verifying each change — practice what we preach.
Motivation
External feedback pointed out an irony: ce:work teaches agents to discover and write tests, but the quality gate says only "All tests pass" — vacuously true when no tests exist. The gap was that "no tests" could be a deliberate decision or an accidental omission, and the skill didn't distinguish between the two.
Rather than introducing a new "testing assessment" abstraction (which would be a self-reported prose artifact), this takes a layered approach: specific deliberation prompts at the point of action (ce:work), preventive annotation at planning time (ce:plan), and detective checks on the actual diff (testing-reviewer).
Test plan
5 new contract tests across 2 test files, following the existing string-assertion pattern:
All 511 tests pass.
Sync decision: Propagated to beta — shared testing deliberation guidance, not experimental delegate-mode behavior.
🤖 Generated with Claude Opus 4.6 (1M context, extended thinking) via Claude Code