feat(optimize): flag low-worth expensive sessions#241

Closed
ozymandiashh wants to merge 1 commit into getagentseal:main from ozymandiashh:feat/worth-it-score

@ozymandiashh

Contributor

Summary

This adds a Worth-it Score style detector to codeburn optimize so expensive sessions with weak delivery signals are easy to review before they become a habit.

  • flags sessions above the spend floor when they have no edit turns, repeated retries, or edit turns with no one-shot success
  • treats git commit, git push, gh pr create, and gh pr merge as delivery signals, while ignoring dry-run commands and read-only git commands like git tag -l
  • keeps the detector conservative: findings are described as review candidates, not proof that a session was wasted
  • lists the most expensive candidates first, caps the preview, and rolls the finding into the existing optimize urgency/health scoring
  • documents the new detector in README and CHANGELOG

Detection model

The detector is intentionally heuristic and bounded:

  • sessions under $2 are ignored entirely
  • no-edit sessions need at least $3 spend before they are reported
  • retry-heavy sessions require at least 3 retries
  • edit sessions with zero one-shot turns require at least 2 retries
  • sessions with an observed delivery command are skipped so normal ship-work is not treated as suspect
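
The thresholds above can be sketched as a small pure function. This is only an illustration of the stated rules, not the code from this diff: the `SessionSummary` shape, the constant names, and `lowWorthReasons` are hypothetical, and it assumes the three conditions are checked independently so one session can carry more than one reason.

```typescript
// Hypothetical session shape; the real codeburn types are not shown in this PR.
interface SessionSummary {
  id: string;
  costUSD: number;
  editTurns: number;
  oneShotTurns: number;
  retries: number;
  hasDeliveryCommand: boolean;
}

type LowWorthReason = "no-edits" | "retry-heavy" | "no-one-shot-edits";

// Thresholds taken from the list above; the names are illustrative.
const SPEND_FLOOR_USD = 2;
const NO_EDIT_MIN_USD = 3;
const RETRY_HEAVY_MIN_RETRIES = 3;
const NO_ONE_SHOT_MIN_RETRIES = 2;

function lowWorthReasons(s: SessionSummary): LowWorthReason[] {
  // Cheap sessions are ignored entirely.
  if (s.costUSD < SPEND_FLOOR_USD) return [];
  // A session that shipped something is not treated as suspect.
  if (s.hasDeliveryCommand) return [];

  const reasons: LowWorthReason[] = [];
  if (s.editTurns === 0 && s.costUSD >= NO_EDIT_MIN_USD) {
    reasons.push("no-edits");
  }
  if (s.retries >= RETRY_HEAVY_MIN_RETRIES) {
    reasons.push("retry-heavy");
  }
  if (
    s.editTurns > 0 &&
    s.oneShotTurns === 0 &&
    s.retries >= NO_ONE_SHOT_MIN_RETRIES
  ) {
    reasons.push("no-one-shot-edits");
  }
  return reasons;
}
```

An empty result means the session is not a review candidate; a non-empty result is still only a prompt for review, in line with the conservative framing in the summary.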

For compatibility with older parsed data, the implementation reads aggregate categoryBreakdown data when present and falls back to raw turns when the aggregate is empty.

Validation

  • npx vitest run tests/optimize.test.ts
  • npm run build
  • npx vitest run
  • node dist/cli.js optimize --help
  • git diff --check

npx tsc --noEmit still fails on the existing Copilot provider type errors in src/providers/copilot.ts, outside this diff. The same failure is present on origin/main.

@ozymandiashh ozymandiashh marked this pull request as ready for review May 5, 2026 23:57
@iamtoruk

iamtoruk commented May 6, 2026

Member

Superseded by #247, which preserves your original intent and integrates the review fixes cleanly on top of #246's dedup pattern.

Found via real-data probe (22.6K sessions / $4.8K spend):

  1. Triple-detector overlap. Top-5 sessions in low-worth, context-bloat, and outliers were the same 5 sessions. Same UX regression we just fixed in #246 (feat(optimize): detect context-heavy sessions). Fix: extended the excludedSessionIds pattern from #246 into a chain, low-worth → context-bloat → outliers, where each later detector excludes sessions named earlier.
  2. git commit-tree regex false positive. commit\b matches inside commit-tree, so sessions running plumbing commands were silently treated as having shipped. Fix: (?:\s|$|--) instead of \b after commit|push. git commit --amend still matches.
  3. Binary impact tier flattened all candidate counts. Same shape we already replaced in #246. Fix: three real tiers (high ≥10 candidates or ≥$50 total · low ≤2 candidates AND <$10 · medium otherwise).
  4. 0.5 magic token-savings ratio. No rationale behind the 50%. Fix: two-regime model — no-edit sessions count full session tokens (no output produced); edit-with-retry sessions count only the retry fraction (retries / totalTurns × tokens). Lands the headline savings at ~17% of spend instead of 54% with full-session or 10% pre-low-worth.
  5. Fix-text near-duplicate of outlier detector. Both said "summarize plan, narrow scope, stop early." Fix: low-worth now focuses on naming a deliverable up front and capping retry attempts; outliers keeps its existing wording.
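
To make item 2 concrete, here is a minimal sketch of the boundary change. The patterns and the `isDeliveryCommand` helper are illustrative reconstructions from the description above, not the regexes in #247; in particular, the dry-run check is assumed to be a plain `--dry-run` flag match, and forms such as `git -C <dir> commit` are deliberately not handled.

```typescript
// Old-style boundary: \b matches between "commit" and "-tree", so plumbing
// commands look like delivery commands.
const LOOSE_GIT_DELIVERY = /\bgit\s+(?:commit|push)\b/;

// Stricter boundary as described in the review: after commit|push require
// whitespace, end of string, or "--".
const GIT_DELIVERY = /\bgit\s+(?:commit|push)(?:\s|$|--)/;

// gh pr create / gh pr merge, followed by whitespace or end of string.
const GH_DELIVERY = /\bgh\s+pr\s+(?:create|merge)(?:\s|$)/;

// Assumption: dry runs are recognised by the literal --dry-run flag only.
const DRY_RUN = /(?:^|\s)--dry-run(?:\s|$)/;

function isDeliveryCommand(command: string): boolean {
  if (DRY_RUN.test(command)) return false;
  return GIT_DELIVERY.test(command) || GH_DELIVERY.test(command);
}

const samples = [
  "git commit -m 'fix parser'",
  "git commit --amend",
  "git commit-tree 1a2b3c -m msg",
  "git commit-graph write",
  "git push --dry-run origin main",
  "git tag -l",
  "gh pr create --fill",
];
for (const cmd of samples) {
  console.log(cmd, "=>", isDeliveryCommand(cmd));
}
```

With the loose pattern, `git commit-tree` and `git commit-graph` both match; with the stricter one they do not, while `git commit --amend` still does because the space after `commit` satisfies the boundary.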

Plus 17 new tests for the detector (including commit-tree false-positive guard, three impact tiers, retry-fraction savings model) and a test for detectContextBloat honoring the new excludedSessionIds parameter.
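
For reference, the three impact tiers and the two-regime savings model from items 3 and 4 can be sketched as below. The function names and the `SessionTokens` shape are hypothetical, and rounding the retry fraction to whole tokens is an assumption, not something the review states.

```typescript
type ImpactTier = "high" | "medium" | "low";

// Tiers as stated in item 3: high wins first, then low, otherwise medium.
function impactTier(candidateCount: number, totalCostUSD: number): ImpactTier {
  if (candidateCount >= 10 || totalCostUSD >= 50) return "high";
  if (candidateCount <= 2 && totalCostUSD < 10) return "low";
  return "medium";
}

// Hypothetical per-session inputs for the savings estimate.
interface SessionTokens {
  editTurns: number;
  retries: number;
  totalTurns: number;
  tokens: number;
}

// Two regimes from item 4:
//  - no-edit session: the full session token count is treated as recoverable
//  - edit-with-retry session: only retries / totalTurns of the tokens
function estimatedSavableTokens(s: SessionTokens): number {
  if (s.editTurns === 0) return s.tokens;
  if (s.totalTurns === 0) return 0; // avoid dividing by zero on empty data
  const retryFraction = Math.min(s.retries / s.totalTurns, 1);
  return Math.round(retryFraction * s.tokens);
}

// Worked example: 3 retries out of 12 turns on a 1,200-token session
// gives a retry fraction of 0.25.
console.log(
  estimatedSavableTokens({ editTurns: 2, retries: 3, totalTurns: 12, tokens: 1200 }),
); // 300
console.log(impactTier(3, 5)); // "medium"
```

The retry-fraction regime is what pulls the headline estimate down: a session that produced edits is not counted as wholly wasted, only the share of turns spent retrying.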

Closing in favor of #247. The detector concept is solid — "weak delivery signal" catches a class of waste that cost outliers and context bloat both miss.

@iamtoruk iamtoruk closed this May 6, 2026
iamtoruk added a commit that referenced this pull request May 6, 2026
Adds a low-worth detector to codeburn optimize that flags expensive sessions with weak delivery signals (no edits, repeated retries, or no one-shot edits) when no git/gh delivery command is observed. Priority order is low-worth → context-bloat → outliers; each later detector excludes sessions named by an earlier one so the same session is never listed in three findings. Detection: $2 floor, $3 for no-edit, 3+ retries; the regex matches git commit/push and gh pr create/merge but excludes commit-tree/commit-graph and dry-run. Three impact tiers consistent with #246. Token-savings uses full session tokens for no-edit sessions and the retry fraction for edit-with-retry sessions. Supersedes #241 with review fixes. Original implementation by @ozymandiashh.
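
The priority chain described in this commit message can be illustrated with a small sketch. Everything here is hypothetical (the `Session`, `Finding`, and `Detector` types, the `topBy` stand-in detectors, and `runInPriorityOrder`); the source only confirms that an excludedSessionIds set is passed along so later detectors skip sessions already named.

```typescript
// Hypothetical minimal types for illustration only.
interface Session {
  id: string;
  costUSD: number;
  contextTokens: number;
}

interface Finding {
  detector: string;
  sessionIds: string[];
}

type Detector = (
  sessions: Session[],
  excludedSessionIds: ReadonlySet<string>,
) => Finding;

// Stand-in detector: names the top-n sessions by some metric, skipping
// anything an earlier detector already claimed.
function topBy(name: string, metric: (s: Session) => number, n: number): Detector {
  return (sessions, excludedSessionIds) => ({
    detector: name,
    sessionIds: sessions
      .filter((s) => !excludedSessionIds.has(s.id))
      .sort((a, b) => metric(b) - metric(a))
      .slice(0, n)
      .map((s) => s.id),
  });
}

// Run detectors in priority order, growing the exclusion set as we go.
function runInPriorityOrder(sessions: Session[], detectors: Detector[]): Finding[] {
  const excluded = new Set<string>();
  const findings: Finding[] = [];
  for (const detect of detectors) {
    const finding = detect(sessions, excluded);
    findings.push(finding);
    for (const id of finding.sessionIds) excluded.add(id);
  }
  return findings;
}

const sessions: Session[] = [
  { id: "a", costUSD: 10, contextTokens: 100 },
  { id: "b", costUSD: 8, contextTokens: 90 },
  { id: "c", costUSD: 5, contextTokens: 80 },
  { id: "d", costUSD: 3, contextTokens: 10 },
];

const findings = runInPriorityOrder(sessions, [
  topBy("low-worth", (s) => s.costUSD, 1),
  topBy("context-bloat", (s) => s.contextTokens, 1),
  topBy("outliers", (s) => s.costUSD, 1),
]);
console.log(findings.map((f) => `${f.detector}: ${f.sessionIds.join(",")}`));
```

Without the exclusion set, all three stand-in detectors would name session `a`; with it, each finding surfaces a different session.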