feat(optimize): detect context-heavy sessions #246
Merged
- Exclude sessions already flagged by `detectContextBloat` from the `detectSessionOutliers` preview. On real data the two findings shared most of their top-5 sessions; the outlier list now focuses on cost anomalies that are not also context-bloated.
- Suppress the "Nx previous session input" growth callout when the previous session is more than 7 days back. This prevents alarming numbers like "1131x growth" that are artifacts of resuming a project after a long break, not bad context management.
- Replace the binary high/medium impact tiering with three tiers: high at >=10 candidates or >=500K total effective tokens, low at <=2 candidates and <200K total, medium otherwise. This stops a single small finding from competing visually with a 300-session pile-up.
- Tests added: medium-tier boundary, high tier at 10+ candidates, 1000+:1 cap with non-zero output, time-gap suppression, anchor growth from a below-threshold predecessor, outlier exclusion when a session is in the context-bloat exclusion set.
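The tiering and time-gap rules in the list above can be sketched as follows. This is a minimal illustration, not the code from this PR: the function names, constant names, and parameter shapes are assumptions, and the guard against a zero-token predecessor is added only to keep the sketch safe. Only the numeric thresholds come from the description.

```typescript
type Impact = "high" | "medium" | "low";

// Thresholds from the PR description; the constant names are illustrative.
const HIGH_MIN_CANDIDATES = 10;
const HIGH_MIN_TOKENS = 500_000;
const LOW_MAX_CANDIDATES = 2;
const LOW_MAX_TOKENS = 200_000; // exclusive upper bound for the low tier
const GROWTH_MAX_GAP_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

// Hypothetical helper: maps a finding's size to one of three impact tiers.
function impactTier(candidateCount: number, totalEffectiveTokens: number): Impact {
  if (candidateCount >= HIGH_MIN_CANDIDATES || totalEffectiveTokens >= HIGH_MIN_TOKENS) {
    return "high";
  }
  if (candidateCount <= LOW_MAX_CANDIDATES && totalEffectiveTokens < LOW_MAX_TOKENS) {
    return "low";
  }
  return "medium";
}

// Hypothetical helper: returns the growth ratio to display, or null when the
// callout should be suppressed because the predecessor is too old.
function growthCallout(
  currentInput: number,
  previousInput: number,
  currentStartMs: number,
  previousStartMs: number,
): number | null {
  if (previousInput <= 0) return null; // assumption: no ratio without a baseline
  if (currentStartMs - previousStartMs > GROWTH_MAX_GAP_MS) return null;
  return currentInput / previousInput;
}
```

Under these assumptions, a single 93K-token session lands in the low tier, and a session that resumes a project eight days after its predecessor produces no growth callout regardless of the ratio.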
This was referenced May 6, 2026
iamtoruk added a commit that referenced this pull request May 6, 2026
Adds a low-worth detector to `codeburn optimize` that flags expensive sessions with weak delivery signals (no edits, repeated retries, or no one-shot edits) when no git/gh delivery command is observed.

Priority order is low-worth → context-bloat → outliers; each later detector excludes sessions named by an earlier one so the same session is never listed in three findings.

Detection: floor, for no-edit, 3+ retries, regex matches `git commit`/`push` and `gh pr create`/`merge` but excludes `commit-tree`/`commit-graph` and dry-run. Three impact tiers consistent with #246. Token-savings uses full session tokens for no-edit sessions and the retry fraction for edit-with-retry sessions.

Supersedes #241 with review fixes. Original implementation by @ozymandiashh.
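The delivery-command check described in this commit message could look roughly like the sketch below. The regular expressions and the `isDeliveryCommand` name are assumptions written for illustration; the actual pattern in the commit is not shown here and may differ, for example in which dry-run spellings it recognises.

```typescript
// Matches `git commit`, `git push`, `gh pr create`, `gh pr merge`.
// The negative lookahead rejects plumbing commands such as `git commit-tree`
// and `git commit-graph`, where the subcommand continues with "-" or a word char.
const DELIVERY_COMMAND = /\b(?:git\s+(?:commit|push)|gh\s+pr\s+(?:create|merge))(?![\w-])/;

// Assumption: only the long `--dry-run` flag counts as a dry run in this sketch.
const DRY_RUN_FLAG = /(?:^|\s)--dry-run(?=\s|=|$)/;

// Hypothetical helper: true when a shell command looks like real delivery.
function isDeliveryCommand(command: string): boolean {
  return DELIVERY_COMMAND.test(command) && !DRY_RUN_FLAG.test(command);
}
```

The short `-n` flag is deliberately not treated as a dry run here, because it means `--dry-run` for `git push` but `--no-verify` for `git commit`.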
Supersedes #242 (cross-fork PR by @ozymandiashh — original commit preserved as the first commit here, follow-up commit applies review fixes).
Summary
Adds a `detectContextBloat()` finding to `codeburn optimize` that flags sessions where effective input/cache tokens (cache-discounted via existing pricing constants) are large and disproportionate to output. Suggests starting fresh with a tightened context.

Review fixes on top of #242
Adversarial review against real session data (22.6K sessions / $4.8K spend) found three concrete issues that this commit addresses:
- Heavy overlap with `detectSessionOutliers`. All 5 of the top-5 context-bloat sessions also appeared in the top-5 outlier list. Same sessions, two framings, two "potential savings" lines that the user mentally adds together. Fix: `detectSessionOutliers` now accepts an optional `excludedSessionIds` set; `scanAndDetect` runs context-bloat first, builds the candidate ID set, and passes it through. Real-data outlier count dropped from 96 → 19, and the two findings' top-5 lists are now disjoint. Headline "potential savings" went from a misleading 62% of spend to an honest 10%.
- Time-blind growth ratio. "1131x previous session input" was alarming on the surface but was sometimes just an artifact of resuming after a long gap (small test session → real working session weeks later). Fix: new `CONTEXT_BLOAT_GROWTH_MAX_GAP_MS = 7 days`; growth ratio is suppressed when the predecessor is older than that.
- Binary impact tiers. Old logic was `>=3 candidates || >=500K total → high, else medium`. A 300-session pile-up scored the same as a 3-session minor finding. Fix: three real tiers: high (≥10 candidates or ≥500K total), low (≤2 candidates AND <200K total), medium otherwise.

Plus 7 added tests:
- `detectSessionOutliers` skips sessions in the exclusion set
- `detectSessionOutliers` still flags cost outliers not in the exclusion set

One existing test updated: a single 93K-token session (1 candidate, <200K total) is now `low` impact rather than `medium`.

Validation
- `npx vitest run`: 35 files, 498 tests pass (was 491 in #242; +7 new)
- `optimize -p 30days`: confirmed dedup works, outlier list shrunk from 96 → 19, savings figures no longer double-counted

Security
No new attack surface. Pure read-only data transform over existing parsed `ProjectSummary`. No I/O, shell, eval, or external input.
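The dedup fix from the review notes (run context-bloat first, then pass its session IDs to the outlier detector) can be sketched as a pure, read-only transform. Everything below is a simplified stand-in: `SessionStats`, `findContextBloat`, `findOutliers`, `scanSketch`, and all thresholds are hypothetical names and values that only mirror the roles of `detectContextBloat`, `detectSessionOutliers`, and `scanAndDetect`; the real detection criteria are not reproduced here.

```typescript
// Hypothetical session shape; the project's real types are not shown in this PR text.
interface SessionStats {
  id: string;
  costUsd: number;
  effectiveInputTokens: number;
  outputTokens: number;
}

// Stand-in thresholds, chosen only for illustration.
const BLOAT_MIN_TOKENS = 500_000;
const BLOAT_MIN_RATIO = 100;
const OUTLIER_MIN_COST_USD = 10;

// Flags sessions whose input is both large and disproportionate to output.
function findContextBloat(sessions: readonly SessionStats[]): SessionStats[] {
  return sessions.filter((s) => {
    const ratio = s.effectiveInputTokens / Math.max(s.outputTokens, 1);
    return s.effectiveInputTokens >= BLOAT_MIN_TOKENS && ratio >= BLOAT_MIN_RATIO;
  });
}

// Flags expensive sessions, skipping any ID in the optional exclusion set.
function findOutliers(
  sessions: readonly SessionStats[],
  excludedSessionIds: ReadonlySet<string> = new Set<string>(),
): SessionStats[] {
  return sessions.filter(
    (s) => s.costUsd >= OUTLIER_MIN_COST_USD && !excludedSessionIds.has(s.id),
  );
}

// Runs context-bloat first, then feeds its IDs to the outlier pass so the two
// result lists are disjoint and their savings are not counted twice.
function scanSketch(sessions: readonly SessionStats[]) {
  const bloat = findContextBloat(sessions);
  const excluded = new Set(bloat.map((s) => s.id));
  const outliers = findOutliers(sessions, excluded);
  return { bloat, outliers };
}
```

Keeping the exclusion set optional means callers that do not pass one see the unfiltered outlier list, so the parameter can be added without changing existing call sites.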