
Commit 6e4db43

Release 0.8.0: model comparison, auto-refresh, menubar fix
Add compare command docs to README, update changelog for 0.8.0, bump version. TUI auto-refresh default 30s, menubar refresh 15s.
1 parent fc576f4 commit 6e4db43

3 files changed

Lines changed: 52 additions & 4 deletions

CHANGELOG.md

Lines changed: 16 additions & 0 deletions
@@ -2,6 +2,22 @@

 ## Unreleased

+## 0.8.0 - 2026-04-19
+
+### Added
+- **`codeburn compare` command.** Side-by-side model comparison across any two models in your session data. Interactive model picker, period switching, and provider filtering.
+- **Compare view in dashboard.** Press `c` in the TUI to enter compare mode. Arrow keys switch periods, `b` to return.
+- **Performance metrics.** One-shot rate, retry rate, and self-correction detection per model. Self-corrections are detected by scanning JSONL transcripts for tool error followed by retry patterns.
+- **Efficiency metrics.** Cost per call, cost per edit turn, output tokens per call, and cache hit rate.
+- **Per-category one-shot rates.** Breaks down one-shot success by task category (Coding, Debugging, Feature Dev, etc.) for each model.
+- **Working style comparison.** Delegation rate, planning rate (TaskCreate, TaskUpdate, TodoWrite), average tools per turn, and fast mode usage.
+- **TUI auto-refresh enabled by default.** Dashboard now refreshes every 30 seconds out of the box. Pass `--refresh 0` to disable. Closes #107.
+- **36 comparison tests.** Full coverage for metric computation, category breakdown, working style, self-correction scanning, and planning tool detection. Total suite: 274 tests.
+
+### Fixed
+- **Planning rate showed ~0% in model comparison.** Only counted `EnterPlanMode` (rarely used) instead of all planning tools (TaskCreate, TaskUpdate, TodoWrite, EnterPlanMode, ExitPlanMode). Now detects planning at the turn level across all five tool types.
+- **Menubar "All" tab showed stale data.** Three-layer caching (300s in-memory TTL, daily disk cache, 60s parser cache) prevented tab switches from showing fresh numbers. Cache TTL reduced from 300s to 30s, tab switches always fetch fresh data, background refresh interval reduced from 60s to 15s.
+
 ## 0.7.4 - 2026-04-19

 ### Added
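
The planning-rate fix in the 0.8.0 changelog entry amounts to checking each turn against all five planning tools instead of `EnterPlanMode` alone. The following is a minimal sketch of that idea, not the code in this commit: the `ToolCall` and `Turn` shapes and the function names are hypothetical, and only the five tool names come from the changelog.

```typescript
// Hypothetical shapes for illustration; the project's real types may differ.
interface ToolCall {
  name: string;
}

interface Turn {
  tools: ToolCall[];
}

// The five planning tools listed in the 0.8.0 changelog entry.
const PLANNING_TOOLS: ReadonlySet<string> = new Set([
  "TaskCreate",
  "TaskUpdate",
  "TodoWrite",
  "EnterPlanMode",
  "ExitPlanMode",
]);

// A turn counts once as a planning turn if it uses any planning tool,
// regardless of how many planning calls it contains.
function isPlanningTurn(turn: Turn): boolean {
  return turn.tools.some((tool) => PLANNING_TOOLS.has(tool.name));
}

// Fraction of turns that involve planning; 0 when there are no turns.
function planningRate(turns: Turn[]): number {
  if (turns.length === 0) return 0;
  return turns.filter(isPlanningTurn).length / turns.length;
}
```

Counting at the turn level means a turn that calls both `TaskCreate` and `TaskUpdate` contributes one planning turn, so the rate stays between 0 and 1.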

README.md

Lines changed: 35 additions & 3 deletions
@@ -51,7 +51,7 @@ codeburn report -p 30days # rolling 30-day window
 codeburn report -p all # every recorded session
 codeburn report --from 2026-04-01 --to 2026-04-10 # exact date range
 codeburn report --format json # full dashboard data as JSON
-codeburn report --refresh 60 # auto-refresh every 60 seconds
+codeburn report --refresh 60 # auto-refresh every 60s (default: 30s)
 codeburn status # compact one-liner (today + month)
 codeburn status --format json
 codeburn export # CSV with today, 7 days, 30 days
@@ -60,7 +60,7 @@ codeburn optimize # find waste, get copy-paste fixes
 codeburn optimize -p week # scope the scan to last 7 days
 ```

-Arrow keys switch between Today / 7 Days / 30 Days / Month / All Time. Press `q` to quit, `1` `2` `3` `4` `5` as shortcuts. The dashboard also shows average cost per session and the five most expensive sessions across all projects.
+Arrow keys switch between Today / 7 Days / 30 Days / Month / All Time. Press `q` to quit, `1` `2` `3` `4` `5` as shortcuts, `c` to open model comparison. The dashboard auto-refreshes every 30 seconds by default (`--refresh 0` to disable). The dashboard also shows average cost per session and the five most expensive sessions across all projects.

 ### JSON output

@@ -176,7 +176,7 @@ The menu bar widget includes a currency picker with 17 common currencies. For an
 npx codeburn menubar
 ```

-One command: downloads the latest `.app`, installs into `~/Applications`, and launches it. Re-run with `--force` to reinstall. Native Swift + SwiftUI app lives in `mac/` (see `mac/README.md` for build details). Shows today's cost with a flame icon, opens a popover with agent tabs, period switcher (Today / 7 Days / 30 Days / Month / All), Trend / Forecast / Pulse / Stats / Plan insights, activity and model breakdowns, optimize findings, and CSV/JSON export. Refreshes live via FSEvents plus a 60-second poll.
+One command: downloads the latest `.app`, installs into `~/Applications`, and launches it. Re-run with `--force` to reinstall. Native Swift + SwiftUI app lives in `mac/` (see `mac/README.md` for build details). Shows today's cost with a flame icon, opens a popover with agent tabs, period switcher (Today / 7 Days / 30 Days / Month / All), Trend / Forecast / Pulse / Stats / Plan insights, activity and model breakdowns, optimize findings, and CSV/JSON export. Refreshes live via FSEvents plus a 15-second poll.

 ## What it tracks
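
The changelog entry for the stale "All" tab pairs a shorter in-memory TTL (30s) with tab switches that always fetch fresh data. The sketch below illustrates that pattern under stated assumptions. `TtlCache`, `get`, and `forceFresh` are hypothetical names, and the real cache may be structured quite differently (the menubar app itself is Swift).

```typescript
// Hypothetical single-entry TTL cache; illustrative only.
class TtlCache<T> {
  private entry: { value: T; storedAt: number } | undefined = undefined;
  private readonly ttlMs: number;
  private readonly now: () => number;

  // `now` is injectable so the expiry logic can be exercised without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value while it is younger than the TTL.
  // Pass forceFresh = true (for example on a tab switch) to bypass the cache.
  get(load: () => T, forceFresh: boolean = false): T {
    const t = this.now();
    if (!forceFresh && this.entry !== undefined && t - this.entry.storedAt < this.ttlMs) {
      return this.entry.value;
    }
    const value = load();
    this.entry = { value, storedAt: t };
    return value;
  }
}
```

With a 30-second TTL, a background poll every 15 seconds would mostly hit the cache, while a forced read on tab switch reloads and resets the entry's age.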

@@ -250,6 +250,37 @@ Each finding shows the estimated token and dollar savings plus a ready-to-paste

 You can also open it inline from the dashboard: press `o` when a finding count appears in the status bar, `b` to return.

+## Compare
+
+Side-by-side model comparison across any two models in your session data. Pick any pair and see how they stack up on real usage from your own sessions.
+
+```bash
+codeburn compare # interactive model picker (default: all time)
+codeburn compare -p week # last 7 days
+codeburn compare -p today # today only
+codeburn compare --provider claude # Claude Code sessions only
+```
+
+Or press `c` in the dashboard to enter compare mode. Arrow keys switch periods, `b` to return.
+
+**Metrics compared**
+
+| Section | Metric | What it measures |
+|---------|--------|-----------------|
+| Performance | One-shot rate | Edits that succeed without retries |
+| Performance | Retry rate | Average retries per edit turn |
+| Performance | Self-correction | Turns where the model corrected its own mistake |
+| Efficiency | Cost / call | Average cost per API call |
+| Efficiency | Cost / edit | Average cost per edit turn |
+| Efficiency | Output tok / call | Average output tokens per call |
+| Efficiency | Cache hit rate | Proportion of input from cache |
+
+**Per-category one-shot rates.** Breaks down one-shot success by task category (Coding, Debugging, Feature Dev, etc.) so you can see where each model excels or struggles.
+
+**Working style.** Compares delegation rate (agent spawns), planning rate (TaskCreate, TaskUpdate, TodoWrite usage), average tools per turn, and fast mode usage.
+
+All metrics are computed from your local session data. No LLM calls, fully deterministic.
+
 ## How it reads data

 **Claude Code** stores session transcripts as JSONL at `~/.claude/projects/<sanitized-path>/<session-id>.jsonl`. Each assistant entry contains model name, token usage (input, output, cache read, cache write), tool_use blocks, and timestamps.
@@ -280,6 +311,7 @@ src/
 parser.ts JSONL reader, dedup, date filter, provider orchestration
 models.ts LiteLLM pricing, cost calculation
 classifier.ts 13-category task classifier
+compare-stats.ts Model comparison engine (metrics, category breakdown, working style)
 types.ts Type definitions
 format.ts Text rendering (status bar)
 menubar-json.ts Payload builder consumed by the native macOS menubar app in mac/
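
The "Metrics compared" table in the README's new Compare section describes several ratios that can be derived from per-model totals. As a rough sketch of how such ratios might be computed (this is not the contents of `compare-stats.ts`): the `ModelUsage` shape, the field names, and the choice of denominator for cache hit rate are all assumptions made for illustration.

```typescript
// Hypothetical per-model aggregate; field names are illustrative only.
interface ModelUsage {
  calls: number;
  costUsd: number;
  outputTokens: number;
  inputTokens: number; // uncached input tokens
  cacheReadTokens: number;
  cacheWriteTokens: number;
  editTurnRetries: number[]; // one entry per edit turn: retries in that turn
}

// Safe division: an empty denominator yields 0 instead of NaN.
function ratio(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

function compareMetrics(u: ModelUsage) {
  const editTurns = u.editTurnRetries.length;
  const oneShotTurns = u.editTurnRetries.filter((r) => r === 0).length;
  const totalRetries = u.editTurnRetries.reduce((sum, r) => sum + r, 0);
  // Assumption: "input" means uncached input plus cache reads plus cache writes.
  const totalInput = u.inputTokens + u.cacheReadTokens + u.cacheWriteTokens;
  return {
    oneShotRate: ratio(oneShotTurns, editTurns), // edit turns with no retries
    retryRate: ratio(totalRetries, editTurns), // average retries per edit turn
    costPerCall: ratio(u.costUsd, u.calls),
    outputTokensPerCall: ratio(u.outputTokens, u.calls),
    cacheHitRate: ratio(u.cacheReadTokens, totalInput),
  };
}
```

Because every value is a plain ratio over local totals, the same input always produces the same output, which matches the README's note that the comparison involves no LLM calls.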

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
 "name": "codeburn",
-"version": "0.7.4",
+"version": "0.8.0",
 "description": "See where your AI coding tokens go - by task, tool, model, and project",
 "type": "module",
 "main": "./dist/cli.js",
