Third-Party Agent Compatibility: Hermes, OpenClaw, and the Billing Tag Problem #8

askalf (Maintainer) · 2026-04-11T00:35:53Z

Overview

We tested two agent frameworks against dario tonight: Hermes (NousResearch) and OpenClaw. Both work. Why each one works, and where Hermes hides a time bomb when run without a proxy, is worth documenting.

Background: The Billing Reclassification Problem

Claude Max subscriptions include a 5-hour rolling budget (five_hour claim) and a 7-day ceiling. Requests made through Claude Code's binary stay on this budget. Requests made via third-party tools using the same OAuth token can be reclassified to overage (pay-per-token) after roughly one hour of sustained use.

This was first surfaced in issue #7 by @belangertrading, who reported requests working for ~1 hour before failing. The root cause was not a rate limit but a billing routing issue: Anthropic's infrastructure detects that the request fingerprint does not match the Claude Code binary and routes the request to a different billing tier.

The reclassification is visible in the response headers:

anthropic-ratelimit-unified-status: five_hour   # correct — on budget
anthropic-ratelimit-unified-status: overage     # reclassified — burning extra credits
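
A client or proxy can watch for this transition by reading the header on every response. The following is a minimal sketch, assuming only the header name and the two values shown above. The function name `classifyBillingStatus` and the labels it returns are hypothetical and are not part of dario.

```javascript
// Hypothetical helper: reports which billing bucket a response was charged to.
// Accepts anything with a get(name) method (fetch Headers, Map) or a plain
// object keyed by lowercase header name.
function classifyBillingStatus(headers) {
  const name = "anthropic-ratelimit-unified-status";
  const value =
    typeof headers.get === "function" ? headers.get(name) : headers[name];

  if (value == null) return "unknown";            // header absent
  if (value === "five_hour") return "on_budget";  // still on the subscription budget
  if (value === "overage") return "reclassified"; // routed to pay-per-token
  return "other";                                 // a value not covered in this post
}
```

Logging the result per request makes the roughly one-hour reclassification point visible as soon as it happens, rather than when requests start failing.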

The Claude Code Request Fingerprint

Through binary reverse-engineering and MITM traffic capture (capture server on ports 9881–9887), we identified seven signals that Anthropic's infrastructure uses to classify requests as genuine Claude Code traffic.

1. Billing Tag (System Prompt Block 0)

The most critical signal. Claude Code injects an x-anthropic-billing-header into the first block of the system prompt on every request:

x-anthropic-billing-header: cc_version=2.1.100.47a; cc_entrypoint=cli; cch=3f8a2;

This block is not cached (no cache_control). The two dynamic components:

Build tag (the 47a suffix of cc_version in the example above): 3-char hex, computed via the Oz$ function in the Claude Code binary:

SHA-256("59cf53e54c78" + msg[4] + msg[7] + msg[20] + version).slice(0, 3)

The seed 59cf53e54c78 is a constant embedded in the binary (constant XGA). Characters at positions 4, 7, and 20 of the first user message are extracted. This makes the build tag deterministic per (message, version) pair.

cch — 5-char hex, confirmed random per request via MITM (10 identical requests → 10 unique cch values):

crypto.randomBytes(3).toString("hex").slice(0, 5)

2. Three-Block System Prompt Structure

Real Claude Code always structures the system prompt as exactly three blocks:

[
  { "type": "text", "text": "x-anthropic-billing-header: cc_version=2.1.100.47a; cc_entrypoint=cli; cch=3f8a2;" },
  { "type": "text", "text": "You are a Claude agent, built on Anthropic's Claude Agent SDK.", "cache_control": { "type": "ephemeral", "ttl": "1h" } },
  { "type": "text", "text": "<actual system prompt>", "cache_control": { "type": "ephemeral", "ttl": "1h" } }
]

Block 0 has no cache control (billing tag rotates per request). Blocks 1–2 are cached 1 hour.

3. Beta Header Order

Exact order from MITM capture:

anthropic-beta: claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05,advisor-tool-2026-03-01,effort-2025-11-24,fast-mode-2026-02-01

4. Device Identity Metadata

Every request includes metadata.user_id as a JSON-stringified object:

{
  "metadata": {
    "user_id": "{\"device_id\":\"60bdf846e8...\",\"account_uuid\":\"...\",\"session_id\":\"...\"}"
  }
}

device_id comes from the userID field in ~/.claude.json (not installId, which is a legacy field). session_id is a fresh randomUUID() generated once per proxy start.

5. Adaptive Thinking + Effort

Claude Code enables adaptive thinking on all non-Haiku models:

{
  "thinking": { "type": "adaptive" },
  "output_config": { "effort": "medium" }
}

Haiku 4.5 does not support thinking and must not receive these fields.
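
A proxy can enforce this rule while shaping the request body. The sketch below is illustrative only: it assumes Haiku models can be recognised by the substring "haiku" in the model ID, which is an assumption rather than something the capture confirms, and `applyThinkingDefaults` is a hypothetical name.

```javascript
// Hypothetical helper: returns a copy of the request body with thinking
// fields defaulted for non-Haiku models and stripped for Haiku models.
function applyThinkingDefaults(body) {
  const out = { ...body };

  // Assumption: the model ID contains "haiku" for every Haiku model.
  const isHaiku = String(out.model ?? "").toLowerCase().includes("haiku");

  if (isHaiku) {
    // Haiku must not receive either field.
    delete out.thinking;
    delete out.output_config;
    return out;
  }

  // Only fill in defaults; values supplied by the client are left alone.
  if (out.thinking === undefined) out.thinking = { type: "adaptive" };
  if (out.output_config === undefined) out.output_config = { effort: "medium" };
  return out;
}
```

Filling in defaults only when a field is absent means a client that asks for a different effort level keeps its own setting.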

6. Static Headers (MITM-verified)

user-agent: claude-cli/2.1.100 (external, cli)
x-app: cli
x-stainless-runtime-version: v24.3.0
x-stainless-package-version: 0.81.0
x-client-request-id: <randomUUID per request>
x-stainless-timeout: 600  # first request; 300 thereafter

7. Context Management

{
  "context_management": {
    "edits": [{ "type": "clear_thinking_20251015", "keep": "all" }]
  }
}

Required for long multi-turn sessions. Paired with the context-management-2025-06-27 beta.

OpenClaw

OpenClaw passed every test pattern without a failure. This is a direct result of how it is built.

OpenClaw uses the Anthropic Messages API natively with a clean WebSocket gateway protocol. It makes no assumptions about billing routing, does not spoof client identity, and sends requests in standard format. When combined with dario, the proxy handles all fingerprint injection upstream — and because OpenClaw's request format is well-structured and spec-compliant, there is nothing to conflict with.

Test results

| Pattern | Status |
|---------|--------|
| Non-streaming (Haiku, Sonnet, Opus) | ✅ Pass |
| Streaming SSE | ✅ Pass |
| Tool use / function calling | ✅ Pass |
| Multi-turn conversation | ✅ Pass |
| OpenAI compat /v1/chat/completions | ✅ Pass |

Why it works cleanly

OpenClaw does not try to manage OAuth identity, inject system headers, or negotiate billing routing itself. It sends a well-formed request and lets the proxy layer do its job. This is the correct architecture for a tool that routes through a proxy — clean separation of concerns between the client and the infrastructure layer.

The result: OpenClaw users running through dario get full billing tag parity and correct five_hour classification, with no configuration beyond pointing ANTHROPIC_BASE_URL at the proxy and setting a placeholder ANTHROPIC_API_KEY.

Setup

ANTHROPIC_BASE_URL=http://localhost:3456
ANTHROPIC_API_KEY=dario

That is the entire configuration. OpenClaw connects, and the proxy handles the rest.

Hermes Compatibility Analysis

We performed a full source audit of Hermes' agent/anthropic_adapter.py. Here is what Hermes sends versus what is required for billing classification parity.

What Hermes does correctly

Sends claude-code-20250219 and oauth-2025-04-20 betas for OAuth tokens
Sets user-agent: claude-cli/<version> (external, cli) and x-app: cli when using OAuth directly with Anthropic
Dynamically detects the installed Claude Code version via claude --version to avoid version-mismatch 400s
Supports thinking: { type: "adaptive" } and output_config.effort for Claude 4.6 models
Correctly skips thinking for Haiku
Sends fine-grained-tool-streaming-2025-05-14 and fast-mode-2026-02-01 betas
Implements token refresh (refresh_anthropic_oauth_pure) with form-encoded and JSON fallback
Implements a full PKCE OAuth flow without requiring Claude Code (run_hermes_oauth_login_pure)
Correctly maps speed: "fast" for Opus 4.6 fast mode (~2.5x throughput)
Third-party endpoint detection via _is_third_party_anthropic_endpoint() — correctly disables OAuth-specific headers when connecting through a localhost proxy

What Hermes does not do

| Missing | Impact |
|---------|--------|
| No `x-anthropic-billing-header` injection | Requests classified as third-party after ~1 hour |
| No build tag computation (`Oz$` algorithm) | Missing billing fingerprint |
| No random `cch` per request | No per-request entropy in billing tag |
| No 3-block system prompt structure | No delivery mechanism for billing tag |
| No `metadata.user_id` device identity | Session tracking not transmitted |

Hermes' build_anthropic_kwargs() prepends "You are Claude Code, Anthropic's official CLI for Claude." to the system prompt when is_oauth=True, and replaces "Hermes Agent" with "Claude Code" in system content. But it does not include the billing tag block or the 3-block structure. Without a proxy, Hermes users will encounter billing reclassification during sustained sessions.

Hermes → dario request flow

When ANTHROPIC_BASE_URL points to dario, Hermes correctly identifies it as a third-party endpoint and uses x-api-key auth. The proxy then handles the full transformation:

Hermes SDK request (x-api-key, Hermes betas only)
        ↓  POST /v1/messages
  dario proxy
        → Inject billing tag (build tag + random cch)
        → Wrap system prompt as 3-block structure
        → Add device identity metadata.user_id
        → Set adaptive thinking / output_config if not present
        → Replace x-api-key with Bearer OAuth token
        → Add full Claude Code beta set (deduplicated)
        → Add all static Claude Code headers
        ↓  POST https://api.anthropic.com/v1/messages
  Anthropic API
        → Classifies as genuine Claude Code traffic
        → Routes to five_hour budget

Test results (8/8 pass)

| # | Test | Status |
|---|------|--------|
| 1 | Non-stream Haiku (x-api-key auth) | ✅ PASS |
| 2 | Streaming Haiku | ✅ PASS |
| 3 | Tool use with mcp_ prefix (Sonnet) | ✅ PASS |
| 4 | Adaptive thinking, effort=high (Opus 4.6) | ✅ PASS |
| 5 | Hermes-specific betas forwarded | ✅ PASS |
| 6 | OpenAI compat /v1/chat/completions | ✅ PASS |
| 7 | Multi-turn with system prompt | ✅ PASS |
| 8 | OpenAI model mapping (gpt-4 → opus) | ✅ PASS |

Setup

ANTHROPIC_BASE_URL=http://localhost:3456
ANTHROPIC_API_KEY=dario

Beta Header Update

Our beta header set was updated to include two betas from Hermes' full list that we were previously missing:

Before:

claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05,advisor-tool-2026-03-01,effort-2025-11-24

After:

claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05,advisor-tool-2026-03-01,effort-2025-11-24,fast-mode-2026-02-01

Added:

fine-grained-tool-streaming-2025-05-14 — enables per-token streaming of tool call arguments (Hermes uses this for real-time tool call rendering)
fast-mode-2026-02-01 — enables speed: "fast" parameter for ~2.5x output throughput on Opus 4.6

Client-provided betas are now deduplicated against our base set before appending.
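
The deduplication step can be sketched as follows, assuming client betas arrive as a comma-separated anthropic-beta header value. `mergeBetas` and `BASE_BETAS` are hypothetical names, and dario's actual implementation may differ.

```javascript
// Base set in the order listed under "After" above.
const BASE_BETAS = [
  "claude-code-20250219",
  "oauth-2025-04-20",
  "interleaved-thinking-2025-05-14",
  "fine-grained-tool-streaming-2025-05-14",
  "context-management-2025-06-27",
  "prompt-caching-scope-2026-01-05",
  "advisor-tool-2026-03-01",
  "effort-2025-11-24",
  "fast-mode-2026-02-01",
];

// Hypothetical helper: keeps the base set first and in its original order,
// then appends any client beta that is not already present.
function mergeBetas(clientHeader) {
  const seen = new Set(BASE_BETAS);
  const merged = [...BASE_BETAS];

  for (const raw of (clientHeader ?? "").split(",")) {
    const beta = raw.trim();
    if (beta && !seen.has(beta)) {
      seen.add(beta);
      merged.push(beta);
    }
  }
  return merged.join(",");
}
```

Appending client betas after the base set, rather than sorting or interleaving them, leaves the captured base order untouched.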

E2E Suite (15/15 pass)

✅ Health endpoint      ✅ Status endpoint     ✅ Models endpoint
✅ Haiku non-stream    ✅ Sonnet non-stream   ✅ Opus non-stream
✅ Sonnet stream       ✅ Opus stream         ✅ OpenAI non-stream
✅ OpenAI stream       ✅ OpenAI model map    ✅ Tool use
✅ Rate limit headers  ✅ Pool routing        ✅ Pool state

Billing claim snapshot after 15 sequential requests:

anthropic-ratelimit-unified-representative-claim: five_hour
anthropic-ratelimit-unified-5h-utilization:       0.33
anthropic-ratelimit-unified-7d-utilization:       0.13
anthropic-ratelimit-unified-status:               five_hour

Summary

| Framework | Works through dario | Billing claim | Notes |
|-----------|---------------------|---------------|-------|
| OpenClaw | ✅ Yes | five_hour | Zero config: clean protocol, no conflicts |
| Hermes | ✅ Yes | five_hour | One env var: proxy injects billing fingerprint |
| Direct OAuth (no proxy) | ⚠️ Partial | overage after ~1hr | Missing billing tag → reclassification |

Recommendations

For OpenClaw users: Set ANTHROPIC_BASE_URL=http://localhost:3456. No other changes needed.

For Hermes users: Same setup. Hermes correctly detects non-Anthropic base URLs and switches to third-party mode automatically. No other changes needed.

For framework authors: Implement the billing tag injection or route through a proxy that does. The full algorithm is documented in issue #7.

For dario users: No action required. The beta header update ships in v2.8.8.

Tested 2026-04-10. Claude Code v2.1.100. Live Anthropic API via OAuth. All billing claims confirmed from response headers.
