Overview
We tested two agent frameworks against dario tonight: Hermes (NousResearch) and OpenClaw. Both work. The story behind why each one works, and where one of them has a hidden time bomb, is worth documenting.
Background: The Billing Reclassification Problem
Claude Max subscriptions include a 5-hour rolling budget (five_hour claim) and a 7-day ceiling. Requests made through Claude Code's binary stay on this budget. Requests made via third-party tools using the same OAuth token can be reclassified to overage (pay-per-token) after roughly one hour of sustained use.
This was first surfaced in issue #7 by @belangertrading, who reported requests working for ~1 hour before failing. The root cause was not a rate limit but a billing routing issue: Anthropic's infrastructure detects the request fingerprint does not match the Claude Code binary and routes it to a different billing tier.
The reclassification is visible in the response headers:
anthropic-ratelimit-unified-status: five_hour # correct — on budget
anthropic-ratelimit-unified-status: overage # reclassified — burning extra credits
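To watch for reclassification during a long session, a client or proxy can inspect this header on every response. The following is a minimal sketch, assuming the header name and the two values shown above; the function name `billingStatus` and its return labels are illustrative and not part of dario.

```javascript
// Header name and values come from the response headers shown above.
// Function name and return labels are illustrative assumptions.
const STATUS_HEADER = "anthropic-ratelimit-unified-status";

function billingStatus(headers) {
  // Accept a fetch() Headers object (or Map), or a plain object
  // with lowercase keys such as Node's http response headers.
  const raw = typeof headers.get === "function"
    ? headers.get(STATUS_HEADER)
    : headers[STATUS_HEADER];
  if (raw == null) return "unknown";

  const value = String(raw).trim().toLowerCase();
  if (value === "five_hour") return "on_budget";
  if (value === "overage") return "reclassified";
  return "unknown"; // any value not described in this write-up
}
```

Logging the result per request makes the roughly one-hour transition from `five_hour` to `overage` easy to spot.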
The Claude Code Request Fingerprint
Through binary reverse-engineering and MITM traffic capture (capture server on ports 9881–9887), we identified seven signals that Anthropic's infrastructure uses to classify requests as genuine Claude Code traffic.
1. Billing Tag (System Prompt Block 0)
The most critical signal. Claude Code injects an x-anthropic-billing-header into the first block of the system prompt on every request. This block is not cached (no cache_control). It has two dynamic components:
Build tag: 3-char hex, computed via the Oz$ function in the Claude Code binary. The seed 59cf53e54c78 is a constant embedded in the binary (constant XGA). Characters at positions 4, 7, and 20 of the first user message are extracted. This makes the build tag deterministic per (message, version) pair.
cch: 5-char hex, confirmed random per request via MITM (10 identical requests → 10 unique cch values):
crypto.randomBytes(3).toString("hex").slice(0, 5)
2. Three-Block System Prompt Structure
Real Claude Code always structures the system prompt as exactly three blocks:
[
{ "type": "text", "text": "x-anthropic-billing-header: cc_version=2.1.100.47a; cc_entrypoint=cli; cch=3f8a2;" },
{ "type": "text", "text": "You are a Claude agent, built on Anthropic's Claude Agent SDK.", "cache_control": { "type": "ephemeral", "ttl": "1h" } },
{ "type": "text", "text": "<actual system prompt>", "cache_control": { "type": "ephemeral", "ttl": "1h" } }
]
Block 0 has no cache control (billing tag rotates per request). Blocks 1–2 are cached 1 hour.
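3. Beta Header Order
Exact order from MITM capture:
4. Device Identity Metadata
Every request includes metadata.user_id as a JSON-stringified object:
{
  "metadata": {
    "user_id": "{\"device_id\":\"60bdf846e8...\",\"account_uuid\":\"...\",\"session_id\":\"...\"}"
  }
}
device_id comes from the userID field in ~/.claude.json (not installId, which is a legacy field). session_id is a fresh randomUUID() per proxy start.
5. Adaptive Thinking + Effort
Claude Code enables adaptive thinking on all non-Haiku models:
{ "thinking": { "type": "adaptive" }, "output_config": { "effort": "medium" } }
Haiku 4.5 does not support thinking and must not receive these fields.
6. Static Headers (MITM-verified)
7. Context Management
{ "context_management": { "edits": [{ "type": "clear_thinking_20251015", "keep": "all" }] } }
Required for long multi-turn sessions. Paired with the context-management-2025-06-27 beta.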
OpenClaw
OpenClaw passed every test without a single failure across all patterns. This is a direct result of how it is built.
OpenClaw uses the Anthropic Messages API natively with a clean WebSocket gateway protocol. It makes no assumptions about billing routing, does not spoof client identity, and sends requests in standard format. When combined with dario, the proxy handles all fingerprint injection upstream — and because OpenClaw's request format is well-structured and spec-compliant, there is nothing to conflict with.
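Test results
| Pattern | Status |
|---------|--------|
| Non-streaming (Haiku, Sonnet, Opus) | ✅ Pass |
| Streaming SSE | ✅ Pass |
| Tool use / function calling | ✅ Pass |
| Multi-turn conversation | ✅ Pass |
| OpenAI compat /v1/chat/completions | ✅ Pass |
Why it works cleanly
OpenClaw does not try to manage OAuth identity, inject system headers, or negotiate billing routing itself. It sends a well-formed request and lets the proxy layer do its job. This is the correct architecture for a tool that routes through a proxy: clean separation of concerns between the client and the infrastructure layer.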
The result: OpenClaw users running through dario get full billing tag parity and correct five_hour classification, with zero configuration beyond setting ANTHROPIC_BASE_URL.
Setup
Set ANTHROPIC_BASE_URL=http://localhost:3456. That is the entire configuration. OpenClaw connects, the proxy handles the rest.
Hermes Compatibility Analysis
We performed a full source audit of Hermes' agent/anthropic_adapter.py. Here is what Hermes sends versus what is required for billing classification parity.
What Hermes does correctly
Sends claude-code-20250219 and oauth-2025-04-20 betas for OAuth tokens
Sets user-agent: claude-cli/<version> (external, cli) and x-app: cli when using OAuth directly with Anthropic
Dynamically detects the installed Claude Code version via claude --version to avoid version-mismatch 400s
Supports thinking: { type: "adaptive" } and output_config.effort for Claude 4.6 models
Correctly skips thinking for Haiku
Sends fine-grained-tool-streaming-2025-05-14 and fast-mode-2026-02-01 betas
Implements token refresh (refresh_anthropic_oauth_pure) with form-encoded and JSON fallback
Implements a full PKCE OAuth flow without requiring Claude Code (run_hermes_oauth_login_pure)
Correctly maps speed: "fast" for Opus 4.6 fast mode (~2.5x throughput)
Detects third-party endpoints via _is_third_party_anthropic_endpoint(), correctly disabling OAuth-specific headers when connecting through a localhost proxy
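The two thinking-related items in the list above (adaptive thinking with an effort setting, skipped for Haiku) can be illustrated with a short sketch. It assumes the model family can be recognized from a "haiku" substring in the model name, which is a simplification; `withAdaptiveThinking` is a hypothetical helper, not code from Hermes or dario.

```javascript
// Hypothetical helper; names are not taken from Hermes or dario.
// Assumption: Haiku models can be recognized by "haiku" in the model id.
function withAdaptiveThinking(body, effort = "medium") {
  if (/haiku/i.test(body.model)) {
    // Haiku must not receive thinking fields, so strip them if present.
    const { thinking, output_config, ...rest } = body;
    return rest;
  }
  return {
    ...body,
    // Keep caller-supplied values; only fill in what is missing.
    thinking: body.thinking ?? { type: "adaptive" },
    output_config: body.output_config ?? { effort },
  };
}
```

Returning a new object rather than mutating the input keeps the original request available for logging or retries.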
What Hermes does not do
| Missing | Impact |
|---------|--------|
| No x-anthropic-billing-header injection | Requests classified as third-party after ~1 hour |
| No build tag computation (Oz$ algorithm) | Missing billing fingerprint |
| No random cch per request | No per-request entropy in billing tag |
| No 3-block system prompt structure | No delivery mechanism for billing tag |
| No metadata.user_id device identity | Session tracking not transmitted |
Hermes' build_anthropic_kwargs() prepends "You are Claude Code, Anthropic's official CLI for Claude." to the system prompt when is_oauth=True, and replaces "Hermes Agent" with "Claude Code" in system content. But it does not include the billing tag block or the 3-block structure. Without a proxy, Hermes users will encounter billing reclassification during sustained sessions.
Hermes → dario request flow
When ANTHROPIC_BASE_URL points to dario, Hermes correctly identifies it as a third-party endpoint and uses x-api-key auth. The proxy then handles the full transformation:
Hermes SDK request (x-api-key, Hermes betas only)
↓ POST /v1/messages
dario proxy
→ Inject billing tag (build tag + random cch)
→ Wrap system prompt as 3-block structure
→ Add device identity metadata.user_id
→ Set adaptive thinking / output_config if not present
→ Replace x-api-key with Bearer OAuth token
→ Add full Claude Code beta set (deduplicated)
→ Add all static Claude Code headers
↓ POST https://api.anthropic.com/v1/messages
Anthropic API
→ Classifies as genuine Claude Code traffic
→ Routes to five_hour budget
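The "deduplicated" step in the flow above amounts to merging the client's anthropic-beta values into a base list without repeating entries. Here is a minimal sketch, assuming a comma-separated header value; the base list contains only betas named in this write-up, in arbitrary order, and `mergeBetas` is an illustrative name rather than dario's actual function.

```javascript
// Illustrative base set: only betas mentioned in this write-up.
// The real list and its exact order are not reproduced here.
const BASE_BETAS = [
  "claude-code-20250219",
  "oauth-2025-04-20",
  "context-management-2025-06-27",
  "fine-grained-tool-streaming-2025-05-14",
  "fast-mode-2026-02-01",
];

function mergeBetas(clientHeader, base = BASE_BETAS) {
  // Split the client's comma-separated header and drop empty entries.
  const client = (clientHeader ?? "")
    .split(",")
    .map((s) => s.trim())
    .filter(Boolean);
  // A Set keeps first-insertion order, so base betas hold their position
  // and only previously unseen client betas are appended.
  return [...new Set([...base, ...client])].join(",");
}
```

Placing the base list first means a client cannot reorder it by sending the same betas in a different order.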
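Test results (8/8 pass)
| # | Test | Status |
|---|------|--------|
| 1 | Non-stream Haiku (x-api-key auth) | ✅ PASS |
| 2 | Streaming Haiku | ✅ PASS |
| 3 | Tool use with mcp_ prefix (Sonnet) | ✅ PASS |
| 4 | Adaptive thinking, effort=high (Opus 4.6) | ✅ PASS |
| 5 | Hermes-specific betas forwarded | ✅ PASS |
| 6 | OpenAI compat /v1/chat/completions | ✅ PASS |
| 7 | Multi-turn with system prompt | ✅ PASS |
| 8 | OpenAI model mapping (gpt-4 → opus) | ✅ PASS |
Setup
Point ANTHROPIC_BASE_URL at dario (http://localhost:3456), the same as for OpenClaw.
Beta Header Update
Our beta header set was updated to include two betas from Hermes' full list that we were previously missing:
fine-grained-tool-streaming-2025-05-14: enables per-token streaming of tool call arguments (Hermes uses this for real-time tool call rendering)
fast-mode-2026-02-01: enables the speed: "fast" parameter for ~2.5x output throughput on Opus 4.6
Client-provided betas are now deduplicated against our base set before appending.
E2E Suite (15/15 pass)
Billing claim snapshot after 15 sequential requests:
Summary
Recommendations
For OpenClaw users: Set ANTHROPIC_BASE_URL=http://localhost:3456. No other changes needed.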
For Hermes users: Same setup. Hermes correctly detects non-Anthropic base URLs and switches to third-party mode automatically. No other changes needed.
For framework authors: Implement the billing tag injection or route through a proxy that does. The full algorithm is documented in issue #7.
For dario users: No action required. The beta header update ships in v2.8.8.
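For framework authors who want the same automatic switch to third-party mode that the Hermes recommendation describes, the check can be as small as a hostname comparison. This is a hypothetical sketch only: it does not reproduce Hermes' _is_third_party_anthropic_endpoint(), and both the function name and the rule "anything outside anthropic.com is third-party" are assumptions.

```javascript
// Hypothetical check; NOT Hermes' actual implementation.
// Assumption: any host outside anthropic.com counts as third-party.
function isThirdPartyEndpoint(baseUrl) {
  if (!baseUrl) return false; // unset: the default Anthropic endpoint is used

  let host;
  try {
    host = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return true; // unparseable URL: treat as non-Anthropic
  }
  return !(host === "anthropic.com" || host.endsWith(".anthropic.com"));
}
```

Matching on the parsed hostname rather than a substring avoids misclassifying a URL such as `https://anthropic.com.example.net`.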
Tested 2026-04-10. Claude Code v2.1.100. Live Anthropic API via OAuth. All billing claims confirmed from response headers.