
feat(testing): close the testing gap in ce:work, ce:plan, and testing-reviewer #438

Merged
tmchow merged 2 commits into main from feat/testing-addressed-gate on Mar 29, 2026

tmchow (Collaborator) commented Mar 29, 2026

Summary

Makes "no tests" a deliberate decision rather than an accidental omission. Targets three layers with focused edits:

  • ce:plan: Blank test scenarios on feature-bearing units are flagged as incomplete during Phase 5.1 review. Units that genuinely need no tests use an explicit `Test expectation: none -- [reason]` annotation.
  • ce:work / ce:work-beta: Execution loop includes per-task testing deliberation ("did this task change behavior? were tests addressed?"). Quality Checklist and Final Validation updated from "Tests pass" to "Testing addressed."
  • testing-reviewer: New 5th check flags behavioral code changes (new branches, state mutations, API changes) with zero corresponding test additions in the diff.
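
For illustration only, the testing-reviewer check described above can be approximated as a mechanical heuristic over a diff. The actual reviewer is guided by the skill text, which this page does not show, so everything below (the `FileChange` type, the path patterns, the keyword list) is a hypothetical sketch. It only catches control-flow keywords and would miss state mutations written as plain assignments.

```python
import re
from dataclasses import dataclass

# Hypothetical conventions: which paths count as tests, and which added
# lines hint at a behavioral change. Neither pattern comes from the PR.
TEST_PATH = re.compile(
    r"(^|/)(tests?|__tests__)/|[._](test|spec)\.[a-z]+$|(^|/)test_[^/]+\.py$"
)
BEHAVIOR_HINT = re.compile(
    r"^\s*(if|elif|else|switch|case|for|while|def|function|return|raise|throw)\b"
)


@dataclass
class FileChange:
    path: str
    added_lines: list[str]


def is_test_file(path: str) -> bool:
    return bool(TEST_PATH.search(path))


def flag_untested_behavior(changes: list[FileChange]) -> list[str]:
    """Return non-test files with behavioral additions when the diff adds no tests."""
    has_test_additions = any(
        is_test_file(c.path) and c.added_lines for c in changes
    )
    if has_test_additions:
        return []
    return [
        c.path
        for c in changes
        if not is_test_file(c.path)
        and any(BEHAVIOR_HINT.match(line) for line in c.added_lines)
    ]


code_only = [FileChange("src/billing.py", ["    if amount > 100:", "        return discount(amount)"])]
with_tests = code_only + [FileChange("tests/test_billing.py", ["def test_discount():"])]
print(flag_untested_behavior(code_only))   # the billing file is flagged
print(flag_untested_behavior(with_tests))  # nothing is flagged
```

A flag here is a prompt for a reviewer to look closer, not proof that tests are missing.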

Ships with 5 contract tests verifying each change — practice what we preach.

Motivation

External feedback pointed out an irony: ce:work teaches agents to discover and write tests, but the quality gate says only "All tests pass" — vacuously true when no tests exist. The gap was that "no tests" could be a deliberate decision or an accidental omission, and the skill didn't distinguish between the two.

Rather than introducing a new "testing assessment" abstraction (which would be a self-reported prose artifact), this takes a layered approach: specific deliberation prompts at the point of action (ce:work), preventive annotation at planning time (ce:plan), and detective checks on the actual diff (testing-reviewer).

Test plan

5 new contract tests across 2 test files, following the existing string-assertion pattern:

  • ce:work execution loop includes testing deliberation in correct position
  • ce:work Quality Checklist and Final Validation use "Testing addressed" (negative assertions confirm old language removed)
  • ce:work-beta mirrors all changes identically
  • ce:plan Phase 5.1 addresses blank test scenarios on feature-bearing units
  • testing-reviewer includes the behavioral-changes-with-no-test-additions check
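
A minimal sketch of what a string-assertion contract test in this style could look like. The repository's real test files and skill wording are not shown on this page, so the sample text, the function names, and the list of stale phrases are assumptions for illustration.

```python
def contract_violations(skill_text: str) -> list[str]:
    """Check a skill file's text for required and forbidden phrases."""
    violations = []
    # Positive assertion: the new language must be present.
    if "Testing addressed" not in skill_text:
        violations.append("missing required phrase: 'Testing addressed'")
    # Negative assertions: the old language must be gone.
    for stale in ("Tests pass", "All tests pass"):
        if stale in skill_text:
            violations.append(f"stale phrase still present: {stale!r}")
    return violations


def appears_in_order(text: str, *phrases: str) -> bool:
    """True when each phrase occurs after the previous one (position check)."""
    pos = -1
    for phrase in phrases:
        pos = text.find(phrase, pos + 1)
        if pos == -1:
            return False
    return True


# Hypothetical stand-ins for skill file contents.
NEW_TEXT = "## Quality Checklist\n- Testing addressed\n## Final Validation\n- Testing addressed\n"
OLD_TEXT = "## Quality Checklist\n- Tests pass\n"
LOOP_TEXT = "implement the task, deliberate on testing, commit"

print(contract_violations(NEW_TEXT))
print(contract_violations(OLD_TEXT))
print(appears_in_order(LOOP_TEXT, "implement", "deliberate on testing", "commit"))
```

In a real suite the text would be read from the skill file on disk, so the test fails whenever a later edit drops the required wording or reintroduces the old phrase.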

All 511 tests pass.

Sync decision: Propagated to beta — shared testing deliberation guidance, not experimental delegate-mode behavior.


Compound Engineering v2.59.0
🤖 Generated with Claude Opus 4.6 (1M context, extended thinking) via Claude Code

…-reviewer

Make "no tests" a deliberate decision rather than an accidental omission
across three layers:

- ce:plan: blank test scenarios on feature-bearing units are now flagged as
  incomplete; units that genuinely need no tests use an explicit annotation
- ce:work/ce:work-beta: execution loop includes per-task testing deliberation;
  quality checklist updated from "Tests pass" to "Testing addressed"
- testing-reviewer: new check flags behavioral changes with zero test additions

Ships with contract tests verifying each change (practice what we preach).

Sync decision: Propagated to beta — shared testing deliberation guidance.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 13fe5e02d2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-plan/SKILL.md Outdated
Feature-bearing units must have actual test scenarios — the
`Test expectation: none` annotation is only valid for non-feature-bearing
units (config, scaffolding, styling). Previous wording allowed
feature-bearing units to use the annotation as an escape hatch.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
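
The rule in this commit message can be sketched as a small validation function. How ce:plan actually represents units is not shown on this page, so the `PlanUnit` type and its `feature_bearing` flag are hypothetical. Only the annotation format `Test expectation: none -- [reason]` comes from the PR description.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Annotation format from the PR description; a non-empty reason is required.
ANNOTATION = re.compile(r"^Test expectation: none -- (?P<reason>\S.*)$")


@dataclass
class PlanUnit:  # hypothetical representation of a plan unit
    name: str
    feature_bearing: bool
    test_scenarios: list[str] = field(default_factory=list)
    test_expectation: str = ""


def review_unit(unit: PlanUnit) -> Optional[str]:
    """Return a finding for an incomplete unit, or None when it passes."""
    has_scenarios = any(s.strip() for s in unit.test_scenarios)
    if has_scenarios:
        return None
    if unit.feature_bearing:
        # The annotation is not an escape hatch for feature-bearing units.
        return f"{unit.name}: feature-bearing unit needs test scenarios"
    if ANNOTATION.match(unit.test_expectation.strip()):
        return None
    return f"{unit.name}: add scenarios or 'Test expectation: none -- [reason]'"


feature = PlanUnit("checkout flow", True, test_expectation="Test expectation: none -- trivial")
styling = PlanUnit("button colors", False, test_expectation="Test expectation: none -- styling only")
print(review_unit(feature))  # flagged despite the annotation
print(review_unit(styling))  # passes
```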
tmchow (Collaborator, Author) commented Mar 29, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. 🎉


@tmchow tmchow merged commit 35678b8 into main Mar 29, 2026
2 checks passed
@github-actions github-actions Bot mentioned this pull request Mar 29, 2026