WIP: cache trace mvp for Anthropic #1370

parubets · 2026-01-21T09:19:29Z

Added a standalone cache tracing module and wired it into the embedded runner so you can capture message flow and the exact context sent to
Anthropic in a separate JSONL file.

What changed

New tracing module: src/agents/cache-trace.ts (self‑contained, env‑gated, writes JSONL, computes per‑message digests).
Hook points in src/agents/pi-embedded-runner/run/attempt.ts: logs stage snapshots (loaded/sanitized/limited/prompt/stream/after) and wraps the
stream fn to record the real context.messages at send time.

How to enable

CLAWDBOT_CACHE_TRACE=1 enables tracing.
CLAWDBOT_CACHE_TRACE_FILE=~/.clawdbot/logs/cache-trace.jsonl overrides output (default is
$CLAWDBOT_STATE_DIR/logs/cache-trace.jsonl).
Optional filters:
- CLAWDBOT_CACHE_TRACE_MESSAGES=0 to omit full messages (still logs digests).
- CLAWDBOT_CACHE_TRACE_PROMPT=0 to omit prompt text.
- CLAWDBOT_CACHE_TRACE_SYSTEM=0 to omit system prompt.
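
For reference, a minimal sketch of how these variables could be resolved into a config object. This is not the code in src/agents/cache-trace.ts: `resolveCacheTraceConfig`, `flag`, the `CacheTraceConfig` shape, and the `~/.clawdbot` fallback for an unset CLAWDBOT_STATE_DIR are assumptions for illustration, as is the rule that only "0" switches a filter off.

```typescript
// Hypothetical sketch; the real parsing in src/agents/cache-trace.ts may differ.
type Env = Record<string, string | undefined>;

interface CacheTraceConfig {
  enabled: boolean;
  filePath: string;
  includeMessages: boolean;
  includePrompt: boolean;
  includeSystem: boolean;
}

// Assumed rule: unset or empty falls back to the default, "0" means off,
// any other value means on.
function flag(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value === "") return fallback;
  return value !== "0";
}

function resolveCacheTraceConfig(env: Env): CacheTraceConfig {
  // Assumed fallback when CLAWDBOT_STATE_DIR is not set.
  const stateDir = env["CLAWDBOT_STATE_DIR"] ?? "~/.clawdbot";
  return {
    enabled: flag(env["CLAWDBOT_CACHE_TRACE"], false),
    filePath: env["CLAWDBOT_CACHE_TRACE_FILE"] ?? `${stateDir}/logs/cache-trace.jsonl`,
    includeMessages: flag(env["CLAWDBOT_CACHE_TRACE_MESSAGES"], true),
    includePrompt: flag(env["CLAWDBOT_CACHE_TRACE_PROMPT"], true),
    includeSystem: flag(env["CLAWDBOT_CACHE_TRACE_SYSTEM"], true),
  };
}
```

Passing the environment in as a plain object (instead of reading it globally) keeps the sketch easy to exercise with different combinations of variables.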

What you’ll see

One JSON object per line with stage, messagesDigest, per‑message messageFingerprints, and the actual messages if enabled.
The most important line is stage: "stream:context" — that is the exact payload pi‑mono is sending. If this diverges from earlier stages, you’ve
found the mutation point.
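
To make the "find the mutation point" step concrete, here is a rough sketch that scans one run's trace lines and reports the first pair of consecutive stages whose messagesDigest differs. The field names stage and messagesDigest come from the description above; `parseTrace`, `firstDivergence`, and the assumption that the input holds a single run in file order are mine.

```typescript
// Hypothetical helper, assuming each JSONL line carries `stage` and `messagesDigest`.
interface TraceRecord {
  stage: string;
  messagesDigest?: string;
}

// Parse JSONL text into records, skipping blank lines.
function parseTrace(jsonl: string): TraceRecord[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TraceRecord);
}

// Return the first pair of consecutive stages whose digests differ, or null
// if every stage that has a digest agrees with the one before it.
function firstDivergence(records: TraceRecord[]): { from: string; to: string } | null {
  let prev: TraceRecord | undefined;
  for (const rec of records) {
    if (rec.messagesDigest === undefined) continue; // stage logged without a digest
    if (prev !== undefined && prev.messagesDigest !== rec.messagesDigest) {
      return { from: prev.stage, to: rec.stage };
    }
    prev = rec;
  }
  return null;
}
```

A change right before stream:context can be expected when the new user prompt is appended before send, so a divergence there is not by itself a bug; divergences between the earlier session stages are the interesting ones.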

parubets · 2026-01-21T09:22:14Z

Added an opt‑in “cache trace” debugger to capture exactly what context is sent to Anthropic on each inference. It’s a standalone module
(src/agents/cache-trace.ts) that writes JSONL to a separate file and records per‑stage snapshots (session loaded → sanitized → limited → prompt →
images → stream context → after). The stream wrapper logs the actual context.messages handed to pi‑ai, plus stable hashes and per‑message
fingerprints, so we can diff when/where history changes and explain prompt‑cache misses.

Enable via env:

CLAWDBOT_CACHE_TRACE=1
optional CLAWDBOT_CACHE_TRACE_FILE=... (default $CLAWDBOT_STATE_DIR/logs/cache-trace.jsonl)
optional CLAWDBOT_CACHE_TRACE_MESSAGES=0, ..._PROMPT=0, ..._SYSTEM=0 to reduce payloads.


steipete · 2026-01-21T10:36:14Z

Landed via temp rebase onto main.

- Gate: pnpm lint && pnpm build && pnpm test
- Land commit: 97e8f9d
- Merge commit: 86ddd3c

Thanks @parubets!

parubets · 2026-01-21T12:18:36Z

Here’s the full‑file analysis (all 67 lines, 9 runs), plus a Python script you can reuse.

Findings (from this trace only)

Within each run, session:loaded, session:sanitized, and session:limited are identical (no in‑run history mutations detected).
Every run shows a digest change at images -> stream (expected: the new user prompt is appended before send).
Three cross‑run mismatches, where the next run’s session:loaded does not equal the previous run’s session:after:
- e2deedf1 -> 4b92a204
- 4b92a204 -> a2ee6008
- c6f73635 -> 5eb6af4b
  In each case, exactly one message fingerprint changes, and it is always a toolResult for the read tool with the same toolCallId. This means
  the stored history was mutated between runs (likely by pruning/sanitization or re‑serialization of tool results). That is enough to
  invalidate the Anthropic prompt cache and trigger new cache writes.
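
The cross‑run check above can be sketched roughly as follows (in TypeScript here, not the Python script mentioned earlier). It compares each run's session:loaded fingerprints against the previous run's session:after. The `runId` field name, the `StageSnapshot` shape, and the helper names are assumptions; only stage and messageFingerprints are named in the trace description.

```typescript
// Hypothetical shapes; `runId` is an assumed field name.
interface StageSnapshot {
  runId: string;
  stage: string;
  messageFingerprints: string[];
}

interface CrossRunMismatch {
  from: string;           // run whose session:after was the baseline
  to: string;             // run whose session:loaded differed
  changed: number[];      // message indices whose fingerprints differ
  lengthChanged: boolean; // true if the two histories have different lengths
}

// Indices where two fingerprint lists differ over their shared length.
function changedIndices(before: string[], after: string[]): number[] {
  const n = Math.min(before.length, after.length);
  const changed: number[] = [];
  for (let i = 0; i < n; i++) {
    if (before[i] !== after[i]) changed.push(i);
  }
  return changed;
}

// Walk snapshots in file order and compare each session:loaded with the
// most recent session:after seen before it.
function crossRunMismatches(snapshots: StageSnapshot[]): CrossRunMismatch[] {
  const result: CrossRunMismatch[] = [];
  let lastAfter: StageSnapshot | undefined;
  for (const snap of snapshots) {
    if (snap.stage === "session:loaded" && lastAfter !== undefined) {
      const changed = changedIndices(lastAfter.messageFingerprints, snap.messageFingerprints);
      const lengthChanged = lastAfter.messageFingerprints.length !== snap.messageFingerprints.length;
      if (changed.length > 0 || lengthChanged) {
        result.push({ from: lastAfter.runId, to: snap.runId, changed, lengthChanged });
      }
    }
    if (snap.stage === "session:after") lastAfter = snap;
  }
  return result;
}
```

A result with exactly one entry in `changed` and `lengthChanged` false corresponds to the "exactly one message fingerprint changes" pattern described above.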

Cache behavior timeline (from session:after usage)

b32a...: cacheRead 0, cacheWrite 12596 (new cache)
109a...: cacheRead 0, cacheWrite 12963 (new cache)
79fc...: cacheRead 12963, cacheWrite 182 (cache hit + small append)
e2de...: cacheRead 13938, cacheWrite 194 (cache hit + small append)
4b92...: cacheRead 14316, cacheWrite 9624 (big rewrite; history mismatch vs previous run)
a2ee...: cacheRead 23940, cacheWrite 479 (hit + small append; history mismatch vs previous run)
c6f7...: cacheRead 25321, cacheWrite 744 (hit + small append)
5eb6...: cacheRead 0, cacheWrite 26649 (full miss; history mismatch vs previous run)
5958...: cacheRead 0, cacheWrite 27872 (full miss; history did match the previous run, so the miss is likely due to non‑history prefix changes:
system/steering/hidden messages, or cache TTL/metadata)
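
A small sketch of how the labels in this timeline could be derived mechanically from the cacheRead/cacheWrite numbers. The 0.25 ratio is a hypothetical threshold chosen only because it reproduces the labels above, and `classifyCacheUsage` is not part of the tracing module. It also cannot tell a first‑run "new cache" from a later full miss; both have cacheRead 0.

```typescript
// Hypothetical classifier; threshold and names are illustrative assumptions.
interface CacheUsage {
  cacheRead: number;
  cacheWrite: number;
}

type CacheOutcome = "miss" | "hit-small-append" | "hit-big-rewrite";

function classifyCacheUsage(usage: CacheUsage, rewriteRatio: number = 0.25): CacheOutcome {
  // Nothing read from cache: either a brand-new cache or a full miss.
  if (usage.cacheRead === 0) return "miss";
  // A large write relative to what was read suggests part of the prefix was rewritten.
  return usage.cacheWrite / usage.cacheRead > rewriteRatio
    ? "hit-big-rewrite"
    : "hit-small-append";
}
```

With the numbers above, 4b92... (14316 read, 9624 written) comes out as a big rewrite, while 79fc... (12963 read, 182 written) is a hit with a small append.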

Interpretation

The cache rewrites you see do correlate with actual history mutation between runs, and those mutations are localized to toolResult read entries.
One full cache miss (5958...) happens without a history mismatch, so there is likely another moving input not captured in this trace
(system/steering messages, hidden prompts, or API behavior).

parubets · 2026-01-21T12:19:36Z

Based on the cache‑trace you gave me, the only history changes between runs are on toolResult entries for the read tool, and those
changes happen between runs (persist/load), not inside the Clawdbot run pipeline. That points to pi‑mono’s SessionManager persistence/reload path,
not Clawdbot’s in‑run sanitizers.

Why I’m confident:

For every run, session:loaded, session:sanitized, session:limited are identical → no Clawdbot mutation during the run.
The three cross‑run mismatches are always a single toolResult (toolName read, same toolCallId), with identical text content.
That means the difference is in non‑text, likely non‑serialized metadata (e.g., details or undefined fields) that JSONL persistence drops. That
happens in pi‑mono (SessionManager uses JSON.stringify).

So: this is not Clawdbot logic, it’s pi‑mono session save/load shape drift. It likely doesn’t change the actual Anthropic payload (tool result text is the
same), but it does change our cache‑trace fingerprint.

If you want to be 100% sure, I can:

Patch cache‑trace to hash the JSON‑serialized message (so undefined fields don’t cause false diffs), and
Add a pi‑mono hook to hash pre‑persist vs post‑reload entries to prove persistence‑only changes.

parubets · 2026-01-21T12:34:40Z

Root cause (confirmed)

ToolResult messages in pi‑mono are created with a details key even when it’s undefined (in packages/agent/src/agent-loop.ts).
The read tool in packages/coding-agent/src/core/tools/read.ts usually returns details as undefined (only set when truncation occurs), so most
read tool results carry details: undefined in memory.
Session persistence in packages/coding-agent/src/core/session-manager.ts uses JSON.stringify, which drops undefined properties. So details:
undefined disappears on disk. When the session reloads, the toolResult has no details field at all.
If your fingerprinting/hashing treats “missing property” vs “present but undefined” as different, you get exactly the “same toolCallId/content
but different fingerprint” symptom, and it shows up most often for read tool results.
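
The effect is easy to reproduce in isolation. The sketch below uses a simplified, hypothetical `ToolResultLike` shape (not pi‑mono's real type) and a deliberately naive key‑based fingerprint to show how a JSON round trip turns "present but undefined" into "missing".

```typescript
// Simplified stand-in for a tool result message; not pi-mono's actual type.
interface ToolResultLike {
  toolCallId: string;
  content: string;
  details?: unknown;
}

// Naive fingerprint that is sensitive to which keys exist on the object.
function keyFingerprint(msg: object): string {
  return Object.keys(msg).sort().join(",");
}

// In memory: the key exists, its value is undefined.
const inMemory: ToolResultLike = { toolCallId: "call_1", content: "file text", details: undefined };

// JSON.stringify omits undefined-valued properties, so the key is gone after reload.
const reloaded = JSON.parse(JSON.stringify(inMemory)) as ToolResultLike;

const beforePersist = keyFingerprint(inMemory); // "content,details,toolCallId"
const afterReload = keyFingerprint(reloaded);   // "content,toolCallId"
```

The content and toolCallId are identical on both sides; only the set of keys differs, which is all a shape‑sensitive fingerprint needs to report a change.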

Fields dropped on disk

toolResult.details when it’s undefined (this is the dominant culprit).
Potentially other optional fields across messages, but the concrete offender in your trace is details for read tool results.

Fix options

Normalize at construction (best local fix)
- Only include details in the ToolResultMessage if it’s defined.
- Example: ...(details === undefined ? {} : { details }).
- This keeps in‑memory and persisted objects identical.
Normalize before persist (systemic fix)
- In SessionManager, deep‑prune undefined fields before JSON.stringify.
- Pros: fixes all message types consistently. Cons: more work and larger surface area.
Normalize before hashing (Clawdbot-side fix)
- Canonicalize messages by removing undefined keys or by JSON‑serialize/parse before hashing.
- Fixes fingerprint diffs without touching pi‑mono persistence.
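
A rough sketch of what these options might look like. `makeToolResult` illustrates the conditional spread from the first option; `pruneUndefined` is one possible helper for the second and third (apply it before persisting, or before hashing). Both names and the simplified `ToolResultMessage` shape are hypothetical, not pi‑mono or Clawdbot code.

```typescript
// Simplified, hypothetical message shape for illustration only.
interface ToolResultMessage {
  role: "toolResult";
  toolCallId: string;
  content: string;
  details?: unknown;
}

// Option 1: only attach `details` when it is actually defined.
function makeToolResult(toolCallId: string, content: string, details?: unknown): ToolResultMessage {
  return {
    role: "toolResult",
    toolCallId,
    content,
    ...(details === undefined ? {} : { details }),
  };
}

// Options 2 and 3: recursively drop undefined-valued keys from plain
// JSON-like data. Array elements are kept in place (JSON.stringify would
// write an undefined element as null), and class instances such as Date
// are not handled specially in this sketch.
function pruneUndefined(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(pruneUndefined);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      if (inner !== undefined) out[key] = pruneUndefined(inner);
    }
    return out;
  }
  return value;
}
```

Hashing `pruneUndefined(message)` instead of the raw message would make the fingerprint insensitive to the undefined‑vs‑missing difference without changing what gets persisted.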

Net: the mutation is not semantic, it’s a serialization artifact (undefined vs missing).

steipete added a commit to parubets/clawdbot that referenced this pull request Jan 21, 2026

fix: add diagnostics cache trace config (openclaw#1370) (thanks @parubets)

5704b33

Andrii and others added 2 commits January 21, 2026 10:23

fix: add diagnostics cache trace config (moltbot#1370) (thanks @parubets)

97e8f9d

steipete force-pushed the fix-debug-ttl-cache branch from 5704b33 to 97e8f9d Compare January 21, 2026 10:35

steipete merged commit 86ddd3c into openclaw:main Jan 21, 2026
18 of 22 checks passed
