3 unstable releases

- 0.2.7 (Jul 26, 2025)
- 0.2.6 (Jul 26, 2025)
- 0.1.0 (Jul 25, 2025)
# LLM Pricing
A CLI tool to visualize OpenRouter model pricing and calculate actual request costs in a clean, tabular format.
## Features
- 📊 Tabular display of model pricing per 1M tokens
- 🧮 Cost calculation for actual requests with input/output tokens
- 💾 Cache pricing support with TTL-based pricing (5min vs 1h)
- 🔍 Filter models by name or provider (e.g., `anthropic`, `sonnet`)
- 📝 Verbose mode showing all model details
- 🌐 Live data fetched from the OpenRouter API (docs, API reference)
## Quick Start
Calculate the cost of a request with 10,000 input tokens, 200 output tokens, and 9,500 cached tokens:
```sh
llm-pricing calc 10000 200 -c 9500 opus-4 gpt-4.1
```

```
Cost calculation: 10000 input + 200 output (9500 cached, 5m TTL)

Model                    | Input     | Output    | Cache Read | Cache Write | Total
-------------------------+-----------+-----------+------------+-------------+----------
anthropic/claude-opus-4  | $0.000000 | $0.015000 | $0.014250  | $0.009375   | $0.038625
openai/gpt-4.1           | $0.001000 | $0.001600 | $0.004750  | $0.000000   | $0.007350
openai/gpt-4.1-mini      | $0.000200 | $0.000320 | $0.000950  | $0.000000   | $0.001470
openai/gpt-4.1-nano      | $0.000050 | $0.000080 | $0.000237  | $0.000000   | $0.000367
```
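The arithmetic behind the opus-4 row is simple: every price is quoted per 1M tokens, cached tokens are billed at the cache-read rate, and the remaining new input tokens are billed at the cache-write rate (see the cache pricing notes below). A rough sketch, using the anthropic/claude-opus-4 prices listed later in this README:

```rust
// Reproduce the anthropic/claude-opus-4 row from the Quick Start output.
// Prices are USD per 1M tokens, taken from the pricing table in this README.
fn main() {
    const M: f64 = 1_000_000.0;
    let (output_price, cache_read_price, cache_write_price) = (75.00, 1.50, 18.75);

    let input_tokens = 10_000.0;
    let output_tokens = 200.0;
    let cached_tokens = 9_500.0;

    // Cached tokens are read back at the cache-read rate...
    let cache_read = cached_tokens / M * cache_read_price; // $0.014250
    // ...and the remaining new input tokens are written to the cache
    // at the cache-write rate instead of the regular input rate.
    let cache_write = (input_tokens - cached_tokens) / M * cache_write_price; // $0.009375
    let output = output_tokens / M * output_price; // $0.015000

    let total = cache_read + cache_write + output;
    println!("total = ${:.6}", total); // total = $0.038625
}
```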
## Installation

### From Releases

Download the latest binary for your platform from the releases page.

### From crates.io

```sh
cargo install llm-pricing
```

### From Source

```sh
git clone https://github.com/tekacs/llm-pricing.git
cd llm-pricing
cargo install --path .
```
## Usage

### Calculate Request Costs

Calculate the actual cost of a request with specific token counts:

```sh
llm-pricing calc 10000 200 opus-4
```

```
Cost calculation: 10000 input + 200 output

Model                    | Input     | Output    | Total
-------------------------+-----------+-----------+----------
anthropic/claude-opus-4  | $0.150000 | $0.015000 | $0.165000
```

With cached tokens (uses a 5-minute TTL by default):

```sh
llm-pricing calc 10000 200 -c 9500 opus-4
```

```
Cost calculation: 10000 input + 200 output (9500 cached, 5m TTL)

Model                    | Input     | Output    | Cache Read | Cache Write | Total
-------------------------+-----------+-----------+------------+-------------+----------
anthropic/claude-opus-4  | $0.000000 | $0.015000 | $0.014250  | $0.009375   | $0.038625
```

With a 1-hour cache TTL (higher write costs):

```sh
llm-pricing calc 10000 200 -c 9500 --ttl 60 opus-4
```
### Understanding Cache vs No-Cache Pricing

The `-c` flag indicates that you're using caching rules, which affects pricing even when no tokens are cached.

Without the `-c` flag (no caching):

```sh
llm-pricing calc 10000 200 opus-4
```

```
Cost calculation: 10000 input + 200 output

Model                    | Input     | Output    | Total
-------------------------+-----------+-----------+----------
anthropic/claude-opus-4  | $0.150000 | $0.015000 | $0.165000
```

With `-c 0` (using caching, 0 cached tokens):

```sh
llm-pricing calc 10000 200 -c 0 opus-4
```

```
Cost calculation: 10000 input + 200 output

Model                    | Input     | Output    | Cache Read | Cache Write | Total
-------------------------+-----------+-----------+------------+-------------+----------
anthropic/claude-opus-4  | $0.000000 | $0.015000 | $0.000000  | $0.187500   | $0.202500
```

When caching is enabled (the `-c` flag), all new tokens are written to the cache at cache-write prices (1.25x the base input price for the 5-minute TTL), which replaces the regular input cost.
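For opus-4, that rule reproduces both totals shown above. A rough sketch of the two cases (prices in USD per 1M tokens, from the pricing table below):

```rust
fn main() {
    const M: f64 = 1_000_000.0;
    let (input_price, output_price, cache_write_price) = (15.00, 75.00, 18.75);
    let (input_tokens, output_tokens) = (10_000.0, 200.0);

    let output = output_tokens / M * output_price; // $0.015000 either way

    // Without -c: input is billed at the regular input rate.
    let no_cache = input_tokens / M * input_price + output; // $0.165000

    // With -c 0: all input is written to cache at 1.25x the base input rate.
    assert!((cache_write_price - input_price * 1.25).abs() < 1e-9);
    let with_cache = input_tokens / M * cache_write_price + output; // $0.202500

    println!("no cache: ${no_cache:.6}, -c 0: ${with_cache:.6}");
}
```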
## List Models

### Basic Usage

Show all models in a table format:

```sh
llm-pricing
```

```
Model                                       | Input | Output | Cache Read | Cache Write
--------------------------------------------+-------+--------+------------+------------
anthropic/claude-opus-4                     | 15.00 | 75.00  | 1.50       | 18.75
anthropic/claude-sonnet-4                   | 3.00  | 15.00  | 0.30       | 3.75
google/gemini-2.5-pro                       | 1.25  | 10.00  | N/A        | N/A
x-ai/grok-4                                 | 3.00  | 15.00  | 0.75       | N/A
openai/gpt-4o                               | 2.50  | 10.00  | N/A        | N/A
...
```
### Filter by Provider

Show only Anthropic models:

```sh
llm-pricing anthropic
```

```
Model                                       | Input | Output | Cache Read | Cache Write
--------------------------------------------+-------+--------+------------+------------
anthropic/claude-opus-4                     | 15.00 | 75.00  | 1.50       | 18.75
anthropic/claude-sonnet-4                   | 3.00  | 15.00  | 0.30       | 3.75
anthropic/claude-3.5-sonnet                 | 3.00  | 15.00  | 0.30       | 3.75
anthropic/claude-3.5-haiku                  | 0.80  | 4.00   | 0.08       | 1.00
anthropic/claude-3-opus                     | 15.00 | 75.00  | 1.50       | 18.75
...
```
### Filter by Model Name

Show models containing "sonnet":

```sh
llm-pricing sonnet
```

```
Model                                       | Input | Output | Cache Read | Cache Write
--------------------------------------------+-------+--------+------------+------------
anthropic/claude-sonnet-4                   | 3.00  | 15.00  | 0.30       | 3.75
anthropic/claude-3.7-sonnet                 | 3.00  | 15.00  | 0.30       | 3.75
anthropic/claude-3.5-sonnet                 | 3.00  | 15.00  | 0.30       | 3.75
anthropic/claude-3-sonnet                   | 3.00  | 15.00  | 0.30       | 3.75
```
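Judging by the examples, a filter keeps any model whose identifier contains the given string. A minimal sketch of that behavior (an assumption about the matching rule, not the tool's actual code):

```rust
fn main() {
    let models = [
        "anthropic/claude-sonnet-4",
        "anthropic/claude-3.5-haiku",
        "openai/gpt-4.1",
    ];
    let filter = "sonnet";

    // Keep any model whose identifier contains the filter string
    // (case-insensitively, as an assumption).
    let matches: Vec<&str> = models
        .iter()
        .copied()
        .filter(|m| m.to_lowercase().contains(&filter.to_lowercase()))
        .collect();

    println!("{matches:?}"); // ["anthropic/claude-sonnet-4"]
}
```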
### Verbose Output

Get detailed information about models with the `-v` flag:

```sh
llm-pricing opus-4 -v
```

```
=== ANTHROPIC ===

Model: anthropic/claude-opus-4
Name: Anthropic: Claude Opus 4
Description: Claude Opus 4 is benchmarked as the world's best coding model, at time of release,
bringing sustained performance on complex, long-running tasks and agent workflows. It sets new
benchmarks in software engineering, achieving leading results on SWE-bench (72.5%) and
Terminal-bench (43.2%).
Pricing:
  Input: $15.00 per 1M tokens
  Output: $75.00 per 1M tokens
  Cache Read: $1.50 per 1M tokens
  Cache Write: $18.75 per 1M tokens
  Per Request: $0
  Image: $0.024
Context Length: 200000 tokens
Modality: text+image->text
Tokenizer: Claude
Max Completion Tokens: 32000
Moderated: true
```
## Understanding the Output

### Table Columns
- Model: The model identifier used in API calls
- Input: Cost per 1M input tokens (USD)
- Output: Cost per 1M output tokens (USD)
- Cache Read: Cost per 1M tokens read from cache (when available)
- Cache Write: Cost per 1M tokens written to cache (when available)
### Cache Pricing
Some providers (like Anthropic and xAI) offer caching to reduce costs on repeated content:
- Cache Read: Much cheaper than regular input tokens (typically 10x less)
- Cache Write: Slightly more expensive than input tokens (to build the cache)
- N/A: Model doesn't support caching
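For the Anthropic prices above, those multipliers can be checked directly. A quick sketch (the exact ratios vary by provider):

```rust
fn main() {
    // anthropic/claude-opus-4 prices, USD per 1M tokens (from the table above).
    let (input, cache_read, cache_write) = (15.00_f64, 1.50, 18.75);

    // Cache reads are 10x cheaper than regular input tokens...
    assert!((input / cache_read - 10.0).abs() < 1e-9);
    // ...and cache writes carry a 1.25x premium over input tokens.
    assert!((cache_write / input - 1.25).abs() < 1e-9);

    println!("read: {}x cheaper, write: {}x premium", input / cache_read, cache_write / input);
}
```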
## CLI Options

### List Command (Default)

```
llm-pricing [OPTIONS] [FILTERS...]

Arguments:
  [FILTERS...]  Filter models by name (e.g., 'anthropic/', 'sonnet')

Options:
  -v, --verbose  Show verbose output with all model information
  -h, --help     Print help
```

### Calculate Command

```
llm-pricing calc [OPTIONS] <INPUT> <OUTPUT> [FILTERS...]

Arguments:
  <INPUT>       Number of input tokens
  <OUTPUT>      Number of output tokens
  [FILTERS...]  Filter models by name (e.g., 'anthropic/', 'sonnet')

Options:
  -c, --cached <CACHED>  Number of cached input tokens read from cache; using this flag enables caching pricing rules
  -t, --ttl <TTL>        Cache TTL in minutes (affects pricing) [default: 5]
  -h, --help             Print help
```
## Development

This project uses `just` for task running:

```sh
# Show available tasks
just

# Build the project
just build

# Run with arguments
just run anthropic -v

# Format and lint
just fmt
just clippy
```
## License
MIT License - see LICENSE for details.