Commit 5d357ba
feat: enable Anthropic prompt caching on system prompt and tools (#69)
* feat: enable Anthropic prompt caching on system prompt and tools
Mark the rendered system prompt and the tool block with cache_control
breakpoints when calling Anthropic models. The static prefix (~4-5K
tokens of system prompt + 15+ tool definitions) was being re-billed at
full input rate on every turn, every retry, and every research
sub-agent iteration (up to 60 per task).
With ephemeral cache breakpoints, subsequent turns within the 5-minute
TTL are billed at cache-read pricing (~10% of input cost). Expected
savings: 40-50% input tokens on multi-turn conversations, 60-80% on
research sub-agent loops.
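
The savings estimate above can be sanity-checked with rough arithmetic. The sketch below is illustrative only: the 4,500-token prefix comes from the "~4-5K" figure in this message, while the per-turn dynamic token count, the constant-size-per-turn simplification, and the 1.25x cache-write multiplier are assumptions that do not appear in this commit. The function names are hypothetical.

```python
def billed_input_units(prefix_tokens, dynamic_tokens_per_turn, turns,
                       cache_read_rate=0.10, cache_write_rate=1.25):
    """Return (uncached, cached) input cost in base-input-token units.

    Assumptions (not from the commit): every turn falls inside the cache
    TTL, the dynamic part of each request has a constant size, and the
    first turn pays a cache-write surcharge on the prefix.
    """
    # Without caching, the full prefix is billed at full rate on every turn.
    uncached = turns * (prefix_tokens + dynamic_tokens_per_turn)
    # With caching, turn 1 writes the prefix and later turns read it cheaply.
    cached = (prefix_tokens * cache_write_rate
              + (turns - 1) * prefix_tokens * cache_read_rate
              + turns * dynamic_tokens_per_turn)
    return uncached, cached


def savings_fraction(prefix_tokens, dynamic_tokens_per_turn, turns):
    """Fraction of input cost saved by caching under the assumptions above."""
    uncached, cached = billed_input_units(
        prefix_tokens, dynamic_tokens_per_turn, turns)
    return 1 - cached / uncached


if __name__ == "__main__":
    print(f"{savings_fraction(4500, 3000, 10):.1%}")
```

With a 4,500-token prefix, 3,000 dynamic tokens per turn, and 10 turns, this model gives a saving of about 47%. The result depends heavily on the assumed share of dynamic tokens, so treat it as a plausibility check rather than a measurement.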
Caching is GA in the Anthropic API and natively supported by litellm
1.83+ via cache_control blocks (no beta header required). Non-Anthropic
models (HF router, OpenAI) are passed through unchanged.
The helper does not mutate the caller's message list or tool list, so
the persisted ContextManager.items history stays in its original
string-content form.
* refactor: hoist prompt_caching imports to module level, drop cached_ prefix
1 parent e2552e8 commit 5d357ba
4 files changed
Lines changed: 77 additions & 4 deletions
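
The behaviour described in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code added in this commit: the names `with_prompt_caching` and `is_anthropic_model`, the model-name heuristic, and the choice to put the tool breakpoint on the last tool definition are all hypothetical. The message and tool shapes follow the OpenAI-style format that litellm accepts.

```python
from typing import Any, Optional

Message = dict[str, Any]
Tool = dict[str, Any]


def is_anthropic_model(model: str) -> bool:
    # Hypothetical heuristic; the real check in the commit is not shown.
    return model.startswith("anthropic/") or model.startswith("claude")


def with_prompt_caching(
    model: str, messages: list[Message], tools: Optional[list[Tool]]
) -> tuple[list[Message], Optional[list[Tool]]]:
    """Return copies of messages/tools carrying cache_control breakpoints.

    Inputs are never mutated, so stored history keeps its plain
    string-content form. Non-Anthropic models get the inputs back as-is.
    """
    if not is_anthropic_model(model):
        return messages, tools

    new_messages = list(messages)  # shallow copy of the list only
    for i, msg in enumerate(new_messages):
        if msg.get("role") == "system" and isinstance(msg.get("content"), str):
            # Replace the string content with one text block that carries
            # the breakpoint, building a new dict rather than editing msg.
            new_messages[i] = {
                **msg,
                "content": [{
                    "type": "text",
                    "text": msg["content"],
                    "cache_control": {"type": "ephemeral"},
                }],
            }
            break

    new_tools = tools
    if tools:
        new_tools = list(tools)
        # Assumption: one breakpoint on the final tool covers the tool block.
        new_tools[-1] = {**tools[-1], "cache_control": {"type": "ephemeral"}}
    return new_messages, new_tools
```

The returned lists would then be passed to the completion call in place of the originals. Building new dicts with `{**msg, ...}` instead of editing in place is what keeps the caller's message and tool lists untouched.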