@parubets

Added a standalone cache tracing module and wired it into the embedded runner so you can capture message flow and the exact context sent to
Anthropic in a separate JSONL file.

What changed

  • New tracing module: src/agents/cache-trace.ts (self‑contained, env‑gated, writes JSONL, computes per‑message digests).
  • Hook points in src/agents/pi-embedded-runner/run/attempt.ts: logs stage snapshots (loaded/sanitized/limited/prompt/stream/after) and wraps the
    stream fn to record the real context.messages at send time.

How to enable

  • CLAWDBOT_CACHE_TRACE=1 enables tracing.
  • CLAWDBOT_CACHE_TRACE_FILE=~/.clawdbot/logs/cache-trace.jsonl overrides output (default is
    $CLAWDBOT_STATE_DIR/logs/cache-trace.jsonl).
  • Optional filters:
    • CLAWDBOT_CACHE_TRACE_MESSAGES=0 to omit full messages (still logs digests).
    • CLAWDBOT_CACHE_TRACE_PROMPT=0 to omit prompt text.
    • CLAWDBOT_CACHE_TRACE_SYSTEM=0 to omit system prompt.
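
For readers who want a feel for the env gating, here is a minimal sketch of how these variables could be resolved into a config object. The names CacheTraceConfig, resolveCacheTraceConfig and fallbackStateDir are hypothetical and are not taken from src/agents/cache-trace.ts; the sketch also assumes that only the literal value 0 disables a filter and it does not expand ~ in paths.

```typescript
// Hypothetical config shape; the real module's option names are not shown in this thread.
interface CacheTraceConfig {
  enabled: boolean;
  filePath: string;
  includeMessages: boolean;
  includePrompt: boolean;
  includeSystem: boolean;
}

type Env = Record<string, string | undefined>;

// Assumption: a filter stays on unless its variable is explicitly "0".
const isNotDisabled = (value: string | undefined): boolean => value?.trim() !== "0";

function resolveCacheTraceConfig(env: Env, fallbackStateDir: string): CacheTraceConfig {
  // Assumption: fallbackStateDir is used when CLAWDBOT_STATE_DIR is unset.
  const stateDir = env.CLAWDBOT_STATE_DIR?.trim() || fallbackStateDir;
  const override = env.CLAWDBOT_CACHE_TRACE_FILE?.trim();
  return {
    enabled: env.CLAWDBOT_CACHE_TRACE?.trim() === "1",
    filePath: override || `${stateDir}/logs/cache-trace.jsonl`,
    includeMessages: isNotDisabled(env.CLAWDBOT_CACHE_TRACE_MESSAGES),
    includePrompt: isNotDisabled(env.CLAWDBOT_CACHE_TRACE_PROMPT),
    includeSystem: isNotDisabled(env.CLAWDBOT_CACHE_TRACE_SYSTEM),
  };
}
```

Passing the environment in as a plain object keeps the function easy to test without touching process.env.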

What you’ll see

  • One JSON object per line with stage, messagesDigest, per‑message messageFingerprints, and the actual messages if enabled.
  • The most important line is stage: "stream:context" — that is the exact payload pi‑mono is sending. If this diverges from earlier stages, you’ve
    found the mutation point.
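
To locate a mutation point, the digests can be compared stage by stage. The sketch below takes the JSONL text and reports every pair of consecutive stages within a run whose messagesDigest differs. The stage and messagesDigest fields come from the description above; the runId field and the function names are assumptions made for illustration.

```typescript
interface TraceEvent {
  runId: string; // assumption: each line carries some run identifier
  stage: string;
  messagesDigest?: string;
}

interface DigestChange {
  runId: string;
  fromStage: string;
  toStage: string;
}

// Parse one JSON object per non-empty line.
function parseTrace(jsonl: string): TraceEvent[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TraceEvent);
}

// Report every place where the digest changes between consecutive stages of the same run.
function findDigestChanges(events: TraceEvent[]): DigestChange[] {
  const lastSeen = new Map<string, TraceEvent>();
  const changes: DigestChange[] = [];
  for (const current of events) {
    if (current.messagesDigest === undefined) continue; // stage without a digest
    const previous = lastSeen.get(current.runId);
    if (previous && previous.messagesDigest !== current.messagesDigest) {
      changes.push({ runId: current.runId, fromStage: previous.stage, toStage: current.stage });
    }
    lastSeen.set(current.runId, current);
  }
  return changes;
}
```

A change reported just before stream:context is expected when the new user prompt is appended; a change between earlier stages would be the interesting one.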

@parubets
Author

Added an opt‑in “cache trace” debugger to capture exactly what context is sent to Anthropic on each inference. It’s a standalone module
(src/agents/cache-trace.ts) that writes JSONL to a separate file and records per‑stage snapshots (session loaded → sanitized → limited → prompt →
images → stream context → after). The stream wrapper logs the actual context.messages handed to pi‑ai, plus stable hashes and per‑message
fingerprints, so we can diff when/where history changes and explain prompt‑cache misses.

Enable via env:

  • CLAWDBOT_CACHE_TRACE=1
  • optional CLAWDBOT_CACHE_TRACE_FILE=... (default $CLAWDBOT_STATE_DIR/logs/cache-trace.jsonl)
  • optional CLAWDBOT_CACHE_TRACE_MESSAGES=0, ..._PROMPT=0, ..._SYSTEM=0 to reduce payloads.

steipete added a commit to parubets/clawdbot that referenced this pull request Jan 21, 2026
Andrii and others added 2 commits January 21, 2026 10:23
@steipete force-pushed the fix-debug-ttl-cache branch from 5704b33 to 97e8f9d on January 21, 2026 10:35
@steipete merged commit 86ddd3c into openclaw:main on Jan 21, 2026
18 of 22 checks passed
@steipete
Contributor

Landed via temp rebase onto main.

  • Gate: pnpm lint && pnpm build && pnpm test
  • Land commit: 97e8f9d
  • Merge commit: 86ddd3c

Thanks @parubets!

@parubets
Author

Here’s the full‑file analysis (all 67 lines, 9 runs), plus a Python script you can reuse.

Findings (from this trace only)

  • Within each run, session:loaded, session:sanitized, and session:limited are identical (no in‑run history mutations detected).
  • Every run shows a digest change at images -> stream (expected: the new user prompt is appended before send).
  • 3 cross‑run mismatches where the next run’s session:loaded does not equal the previous run’s session:after:
    • e2deedf1 -> 4b92a204
    • 4b92a204 -> a2ee6008
    • c6f73635 -> 5eb6af4b
      In each case, exactly one message fingerprint changes, and it is always a toolResult for the read tool with the same toolCallId. This means
      the stored history changed between runs (likely through pruning/sanitization or re‑serialization of tool results). That is enough to
      invalidate the Anthropic prompt cache and trigger new cache writes.
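
The cross‑run check can be sketched as follows: remember the last session:after snapshot, and when the next session:loaded appears, list the message indexes whose fingerprints differ. The stage names and messageFingerprints come from the trace; runId, the Snapshot shape, and the assumption that the file holds a single session in chronological order are mine.

```typescript
interface Snapshot {
  runId: string; // assumption: run identifier recorded on each line
  stage: string;
  messageFingerprints: string[];
}

interface CrossRunMismatch {
  fromRun: string;
  toRun: string;
  changedIndexes: number[];
}

// Indexes at which the two fingerprint lists disagree (including length differences).
function diffFingerprints(before: string[], after: string[]): number[] {
  const changed: number[] = [];
  const total = Math.max(before.length, after.length);
  for (let i = 0; i < total; i++) {
    if (before[i] !== after[i]) changed.push(i);
  }
  return changed;
}

// Compare each run's session:loaded against the previous run's session:after.
function findCrossRunMismatches(snapshots: Snapshot[]): CrossRunMismatch[] {
  const mismatches: CrossRunMismatch[] = [];
  let previousAfter: Snapshot | undefined;
  for (const snapshot of snapshots) {
    if (snapshot.stage === "session:loaded" && previousAfter) {
      const changedIndexes = diffFingerprints(
        previousAfter.messageFingerprints,
        snapshot.messageFingerprints,
      );
      if (changedIndexes.length > 0) {
        mismatches.push({ fromRun: previousAfter.runId, toRun: snapshot.runId, changedIndexes });
      }
    }
    if (snapshot.stage === "session:after") previousAfter = snapshot;
  }
  return mismatches;
}
```

A single changed index per mismatch, as reported above, points at one specific stored message rather than a wholesale rewrite of the history.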

Cache behavior timeline (from session:after usage)

  • b32a...: cacheRead 0, cacheWrite 12596 (new cache)
  • 109a...: cacheRead 0, cacheWrite 12963 (new cache)
  • 79fc...: cacheRead 12963, cacheWrite 182 (cache hit + small append)
  • e2de...: cacheRead 13938, cacheWrite 194 (cache hit + small append)
  • 4b92...: cacheRead 14316, cacheWrite 9624 (big rewrite; history mismatch vs previous run)
  • a2ee...: cacheRead 23940, cacheWrite 479 (hit + small append; history mismatch vs previous run)
  • c6f7...: cacheRead 25321, cacheWrite 744 (hit + small append)
  • 5eb6...: cacheRead 0, cacheWrite 26649 (full miss; history mismatch vs previous run)
  • 5958...: cacheRead 0, cacheWrite 27872 (full miss; history did match prev run, so miss is likely due to non‑history prefix changes:
    system/steering/hidden messages, or cache TTL/metadata)

Interpretation

  • The cache rewrites you see do correlate with actual history mutation between runs, and those mutations are localized to toolResult read entries.
  • One full cache miss (5958...) happens without a history mismatch, so there is likely another moving input not captured in this trace
    (system/steering messages, hidden prompts, or API behavior).

@parubets
Author

Based on the cache‑trace you gave me, the only history changes between runs are on toolResult entries for the read tool, and those
changes happen between runs (persist/load), not inside the Clawdbot run pipeline. That points to pi‑mono’s SessionManager persistence/reload path,
not Clawdbot’s in‑run sanitizers.

Why I’m confident:

  • For every run, session:loaded, session:sanitized, session:limited are identical → no Clawdbot mutation during the run.
  • The three cross‑run mismatches are always a single toolResult (toolName read, same toolCallId), with identical text content.
  • That means the difference is in non‑text, likely non‑serialized metadata (e.g., details or undefined fields) that JSONL persistence drops. That
    happens in pi‑mono (SessionManager uses JSON.stringify).

So this is not Clawdbot logic; it is pi‑mono session save/load shape drift. It likely doesn’t change the actual Anthropic payload (the tool result
text is the same), but it does change our cache‑trace fingerprint.

If you want to be 100% sure, I can:

  1. Patch cache‑trace to hash the JSON‑serialized message (so undefined fields don’t cause false diffs), and
  2. Add a pi‑mono hook to hash pre‑persist vs post‑reload entries to prove persistence‑only changes.

@parubets
Author

Root cause (confirmed)

  • ToolResult messages in pi‑mono are created with a details key even when it’s undefined (in packages/agent/src/agent-loop.ts).
  • The read tool in packages/coding-agent/src/core/tools/read.ts usually returns details as undefined (only set when truncation occurs), so most
    read tool results carry details: undefined in memory.
  • Session persistence in packages/coding-agent/src/core/session-manager.ts uses JSON.stringify, which drops undefined properties. So details:
    undefined disappears on disk. When the session reloads, the toolResult has no details field at all.
  • If your fingerprinting/hashing treats “missing property” vs “present but undefined” as different, you get exactly the “same toolCallId/content
    but different fingerprint” symptom, and it shows up most often for read tool results.
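
The effect is easy to reproduce in isolation. In the sketch below the message shape is illustrative, and shapeSensitiveFingerprint is a hypothetical stand‑in for any hash that distinguishes “missing” from “present but undefined”; it is not the cache‑trace implementation.

```typescript
import { createHash } from "node:crypto";

// Hypothetical fingerprint that records every own key, including keys whose value is undefined.
function shapeSensitiveFingerprint(message: Record<string, unknown>): string {
  const parts = Object.keys(message)
    .sort()
    // JSON.stringify(undefined) yields undefined, which String() renders as "undefined".
    .map((key) => `${key}=${String(JSON.stringify(message[key]))}`);
  return createHash("sha256").update(parts.join("\n")).digest("hex");
}

// Simulates a save/load cycle through JSON, as a JSONL session file would.
function roundTrip<T>(value: T): T {
  return JSON.parse(JSON.stringify(value)) as T;
}

// Illustrative in-memory tool result carrying an explicit `details: undefined`.
const inMemory = {
  role: "toolResult",
  toolCallId: "call_1",
  toolName: "read",
  content: "file contents",
  details: undefined,
};
const reloaded = roundTrip(inMemory);

console.log("details" in inMemory); // true: the key exists, its value is undefined
console.log("details" in reloaded); // false: JSON.stringify dropped the key
console.log(shapeSensitiveFingerprint(inMemory) === shapeSensitiveFingerprint(reloaded)); // false
```

The content and toolCallId are identical on both sides; only the set of keys differs, which matches the symptom in the trace.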

Fields dropped on disk

  • toolResult.details when it’s undefined (this is the dominant culprit).
  • Potentially other optional fields across messages, but the concrete offender in your trace is details for read tool results.

Fix options

  1. Normalize at construction (best local fix)
    • Only include details in the ToolResultMessage if it’s defined.
    • Example: ...(details === undefined ? {} : { details }).
    • This keeps in‑memory and persisted objects identical.
  2. Normalize before persist (systemic fix)
    • In SessionManager, deep‑prune undefined fields before JSON.stringify.
    • Pros: fixes all message types consistently. Cons: more work and larger surface area.
  3. Normalize before hashing (Clawdbot-side fix)
    • Canonicalize messages by removing undefined keys or by JSON‑serialize/parse before hashing.
    • Fixes fingerprint diffs without touching pi‑mono persistence.
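
A rough sketch of options 1 and 3, under stated assumptions: the ToolResultMessage shape, makeToolResult, canonicalize and canonicalFingerprint are hypothetical names, not pi‑mono or Clawdbot APIs, and the canonicalizer assumes plain JSON‑style data (no Dates, Maps, or class instances). The same pruning step could be applied before JSON.stringify for option 2.

```typescript
import { createHash } from "node:crypto";

// Hypothetical message shape for illustration only.
interface ToolResultMessage {
  role: "toolResult";
  toolCallId: string;
  toolName: string;
  content: string;
  details?: unknown;
}

// Option 1: only attach `details` when it is actually defined.
function makeToolResult(
  toolCallId: string,
  toolName: string,
  content: string,
  details?: unknown,
): ToolResultMessage {
  return {
    role: "toolResult",
    toolCallId,
    toolName,
    content,
    ...(details === undefined ? {} : { details }),
  };
}

// Option 3: drop undefined keys and sort keys so equal data always serializes identically.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) {
    // Mirror JSON.stringify, which writes undefined array items as null.
    return value.map((item) => (item === undefined ? null : canonicalize(item)));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const result: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      if (source[key] !== undefined) result[key] = canonicalize(source[key]);
    }
    return result;
  }
  return value;
}

// Expects an object or array; hashes its canonical JSON form.
function canonicalFingerprint(message: object): string {
  return createHash("sha256").update(JSON.stringify(canonicalize(message))).digest("hex");
}
```

With either change, a message with details: undefined and its reloaded copy without the key produce the same fingerprint, so the remaining diffs in a trace would reflect real content changes.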

Net: the mutation is not semantic, it’s a serialization artifact (undefined vs missing).
