ozymandiashh · 2026-05-05T23:56:22Z

Summary

This adds a Worth-it Score style detector to codeburn optimize so expensive sessions with weak delivery signals are easy to review before they become a habit.

flags sessions above the spend floor when they have no edit turns, repeated retries, or edit turns with no one-shot success
treats git commit, git push, gh pr create, and gh pr merge as delivery signals, while ignoring dry-run commands and read-only git commands like git tag -l
keeps the detector conservative: findings are described as review candidates, not proof that a session was wasted
lists the most expensive candidates first, caps the preview, and rolls the finding into the existing optimize urgency/health scoring
documents the new detector in README and CHANGELOG

Detection model

The detector is intentionally heuristic and bounded:

sessions under $2 are ignored entirely
no-edit sessions need at least $3 spend before they are reported
retry-heavy sessions require at least 3 retries
edit sessions with zero one-shot turns require at least 2 retries
sessions with an observed delivery command are skipped so normal ship-work is not treated as suspect

For compatibility with older parsed data, the implementation reads aggregate categoryBreakdown data when present and falls back to raw turns when the aggregate is empty.
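
The thresholds listed above can be read as a small decision function. The following is a minimal sketch, not the code from this PR: the `SessionStats` shape, the field names, and the reason labels are hypothetical, and only the numeric thresholds come from the description.

```typescript
// Hypothetical per-session summary; field names are assumptions for illustration,
// not codeburn's actual types.
interface SessionStats {
  costUSD: number;
  editTurns: number;
  oneShotTurns: number;
  retries: number;
  hasDeliveryCommand: boolean;
}

// Thresholds taken from the "Detection model" list.
const MIN_COST_USD = 2;
const NO_EDIT_MIN_COST_USD = 3;
const RETRY_HEAVY_MIN_RETRIES = 3;
const ZERO_ONE_SHOT_MIN_RETRIES = 2;

type Reason = "no-edits" | "retry-heavy" | "no-one-shot";

// Returns the reasons a session is a review candidate; an empty array means "not flagged".
function lowWorthReasons(s: SessionStats): Reason[] {
  if (s.costUSD < MIN_COST_USD) return [];   // below the spend floor
  if (s.hasDeliveryCommand) return [];       // observed ship-work is skipped

  const reasons: Reason[] = [];
  if (s.editTurns === 0 && s.costUSD >= NO_EDIT_MIN_COST_USD) {
    reasons.push("no-edits");
  }
  if (s.retries >= RETRY_HEAVY_MIN_RETRIES) {
    reasons.push("retry-heavy");
  }
  if (s.editTurns > 0 && s.oneShotTurns === 0 && s.retries >= ZERO_ONE_SHOT_MIN_RETRIES) {
    reasons.push("no-one-shot");
  }
  return reasons;
}
```

Returning a list of reasons rather than a boolean keeps the finding phrased as a review candidate: the report can say why a session was surfaced without claiming it was wasted.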

Validation

npx vitest run tests/optimize.test.ts
npm run build
npx vitest run
node dist/cli.js optimize --help
git diff --check

npx tsc --noEmit still fails on the existing Copilot provider type errors in src/providers/copilot.ts, outside this diff. The same failure is present on origin/main.

iamtoruk · 2026-05-06T07:35:15Z

Superseded by #247, which preserves your original intent and integrates the review fixes cleanly on top of #246's dedup pattern.

Found via real-data probe (22.6K sessions / $4.8K spend):

Triple-detector overlap. The top-5 sessions in low-worth, context-bloat, and outliers were the same 5 sessions. Same UX regression we just fixed in feat(optimize): detect context-heavy sessions #246. Fix: extended the excludedSessionIds pattern from #246 to the chain low-worth → context-bloat → outliers, so each later detector excludes sessions named earlier.
git commit-tree regex false positive. commit\b matches inside commit-tree, so sessions running plumbing commands were silently treated as having shipped. Fix: (?:\s|$|--) instead of \b after commit|push. git commit --amend still matches.
Binary impact tier flattened all candidate counts. Same shape we already replaced in #246. Fix: three real tiers (high: ≥10 candidates or ≥$50 total; low: ≤2 candidates AND <$10; medium otherwise).
0.5 magic token-savings ratio. There was no rationale behind the 50%. Fix: a two-regime model. No-edit sessions count full session tokens (no output produced); edit-with-retry sessions count only the retry fraction (retries / totalTurns × tokens). This lands the headline savings at ~17% of spend, instead of 54% with full-session counting or 10% pre-low-worth.
Fix-text near-duplicate of outlier detector. Both said "summarize plan, narrow scope, stop early." Fix: low-worth now focuses on naming a deliverable up front and capping retry attempts; outliers keeps its existing wording.
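
To illustrate the commit-tree finding above, here is a minimal sketch of a delivery-command matcher that uses the `(?:\s|$|--)` tail instead of `\b`. The pattern names, the `isDeliveryCommand` helper, and the dry-run check are assumptions for illustration; the actual regex in #247 may be structured differently.

```typescript
// Sketch only: names and exact patterns are assumptions, not the code merged in #247.
// The tail (?:\s|$|--) requires whitespace, end of string, or "--" after commit|push,
// so "commit-tree" and "commit-graph" no longer match the way they do with \b.
const GIT_DELIVERY = /\bgit\s+(?:commit|push)(?:\s|$|--)/;
const GH_DELIVERY = /\bgh\s+pr\s+(?:create|merge)(?:\s|$)/;
const DRY_RUN = /(?:^|\s)--dry-run\b/;

// The earlier shape described in the review: \b matches between "commit" and "-tree".
const OLD_GIT_DELIVERY = /\bgit\s+(?:commit|push)\b/;

function isDeliveryCommand(command: string): boolean {
  if (DRY_RUN.test(command)) return false; // dry runs ship nothing
  return GIT_DELIVERY.test(command) || GH_DELIVERY.test(command);
}

console.log(OLD_GIT_DELIVERY.test("git commit-tree abc123")); // true (the false positive)
console.log(isDeliveryCommand("git commit-tree abc123"));     // false
console.log(isDeliveryCommand("git commit --amend"));         // true
console.log(isDeliveryCommand("git push --dry-run"));         // false
console.log(isDeliveryCommand("git tag -l"));                 // false
```

This sketch does not handle invocations with options between `git` and the subcommand (for example `git -C path commit`); whether the real detector does is not stated in the thread.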

Plus 17 new tests for the detector (including commit-tree false-positive guard, three impact tiers, retry-fraction savings model) and a test for detectContextBloat honoring the new excludedSessionIds parameter.
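
The three impact tiers and the two-regime savings model from the list above can be sketched as two small functions. This is an illustration under stated assumptions: `impactTier`, `estimatedWastedTokens`, and the `Candidate` fields are hypothetical names, and the rounding and clamping are choices made for the sketch rather than documented behavior.

```typescript
type Impact = "high" | "medium" | "low";

// high: >= 10 candidates or >= $50 total; low: <= 2 candidates AND < $10; medium otherwise.
function impactTier(candidateCount: number, totalCostUSD: number): Impact {
  if (candidateCount >= 10 || totalCostUSD >= 50) return "high";
  if (candidateCount <= 2 && totalCostUSD < 10) return "low";
  return "medium";
}

// Hypothetical candidate shape; field names are assumptions.
interface Candidate {
  tokens: number;
  editTurns: number;
  retries: number;
  totalTurns: number;
}

// No-edit sessions count all their tokens; edit-with-retry sessions count only
// the retry fraction (retries / totalTurns) of their tokens.
function estimatedWastedTokens(c: Candidate): number {
  if (c.editTurns === 0) return c.tokens;
  if (c.totalTurns === 0) return 0; // guard against division by zero (sketch-only choice)
  const retryFraction = Math.min(1, c.retries / c.totalTurns);
  return Math.round(c.tokens * retryFraction);
}

console.log(impactTier(2, 9.99)); // "low"
console.log(impactTier(2, 10));   // "medium"
console.log(estimatedWastedTokens({ tokens: 1000, editTurns: 3, retries: 2, totalTurns: 8 })); // 250
```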

Closing in favor of #247. The detector concept is solid — "weak delivery signal" catches a class of waste that cost outliers and context bloat both miss.

@ozymandiashh

Adds a low-worth detector to codeburn optimize that flags expensive sessions with weak delivery signals (no edits, repeated retries, or no one-shot edits) when no git/gh delivery command is observed. Priority order is low-worth → context-bloat → outliers; each later detector excludes sessions named by an earlier one, so the same session is never listed in three findings. Detection: a spend floor, a higher floor for no-edit sessions ($2 and $3 respectively in #241), and 3+ retries for retry-heavy sessions; the regex matches git commit/push and gh pr create/merge but excludes commit-tree/commit-graph and dry-run commands. Three impact tiers, consistent with #246. Token savings use full session tokens for no-edit sessions and the retry fraction for edit-with-retry sessions. Supersedes #241 with review fixes. Original implementation by @ozymandiashh.
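
The priority ordering described above (each later detector skips sessions an earlier one already named) can be sketched as a shared exclusion set threaded through the detectors. Only the `excludedSessionIds` idea comes from the thread; the `Detector` signature, `runInPriorityOrder`, and the `Finding` shape are hypothetical.

```typescript
interface Finding {
  detector: string;
  sessionIds: string[];
}

// Hypothetical detector signature: given sessions and the ids already claimed,
// return the ids this detector wants to report.
type Detector<S> = (sessions: S[], excludedSessionIds: ReadonlySet<string>) => string[];

function runInPriorityOrder<S>(sessions: S[], detectors: Array<[string, Detector<S>]>): Finding[] {
  const excluded = new Set<string>();
  const findings: Finding[] = [];
  for (const [name, detect] of detectors) {
    // Filter defensively in case a detector ignores the exclusion set.
    const ids = detect(sessions, excluded).filter((id) => !excluded.has(id));
    for (const id of ids) excluded.add(id);
    findings.push({ detector: name, sessionIds: ids });
  }
  return findings;
}
```

Passing detectors in the order low-worth, context-bloat, outliers would reproduce the chain described in the thread. Whether the real implementation excludes every flagged session or only those shown in the capped preview is not stated, so this sketch excludes all of them.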

feat(optimize): flag low-worth expensive sessions

0efac7a

ozymandiashh marked this pull request as ready for review May 5, 2026 23:57

iamtoruk mentioned this pull request May 6, 2026

feat(optimize): flag low-worth expensive sessions #247

Merged

iamtoruk closed this May 6, 2026

feat(optimize): flag low-worth expensive sessions #241

ozymandiashh wants to merge 1 commit into getagentseal:main from ozymandiashh:feat/worth-it-score
