ghanibot/nano-proxy

nano-proxy

Lightweight LLM API proxy — routing, fallback, and cost tracking for 6 providers. Built to fix what LiteLLM got wrong.


The Problem

Switching LLM providers requires rewriting code. Rate limits crash your agents. Costs explode silently. LiteLLM "solves" this but ships 100+ dependencies, breaks on Windows, and takes seconds to import.

nano-proxy fixes all of it.


What Makes It Different

| Problem with others | nano-proxy solution |
|---|---|
| Switching providers = rewriting code | One URL change: base_url="http://localhost:8765/groq/v1" |
| Rate limit crashes entire pipeline | Automatic fallback: 429 → next provider, transparent to caller |
| 100+ dependencies (LiteLLM) | Minimal deps: only fastapi, httpx, pydantic, uvicorn |
| Cloud-only (OpenRouter, Portkey) | Fully local: runs on your machine, data never leaves |
| No cost visibility | Real-time cost tracking: per-provider USD counters, /cost endpoint |
| Windows support broken | Windows-first: tested on PowerShell, no POSIX assumptions |
| Slow import / startup | Fast start: proxy up in under 1 second |

Quick Start

# Install
pip install git+https://github.com/ghanibot/nano-proxy.git

# Start proxy (default: http://localhost:8765)
nano-proxy start

# Start with config file
nano-proxy start --config configs/proxy.yaml

# Check status
nano-proxy status

# Cost report
nano-proxy cost

Integration — 1 Line Change

import anthropic
import openai

# Direct to Anthropic (no proxy)
client = anthropic.Anthropic()

# Via nano-proxy → Anthropic
client = anthropic.Anthropic(base_url="http://localhost:8765/anthropic")

# Via nano-proxy → Groq (llama/mixtral)
client = openai.OpenAI(
    api_key="anything",
    base_url="http://localhost:8765/groq/v1"
)

# Via nano-proxy → auto-route (proxy picks best available provider)
client = openai.OpenAI(
    api_key="anything",
    base_url="http://localhost:8765/auto/v1"
)

Works with any SDK that supports a custom base_url, including anthropic, openai, langchain, nano-eval, nano-memory, and nano-orchestrator.
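
As a rough sketch of that one-line change, the helper below builds the base_url for a given provider. The name `proxy_base_url` is hypothetical and not part of nano-proxy; it only assumes the URL shapes used in the examples above (the anthropic SDK appends /v1/messages itself, while the openai SDK expects a base_url that already ends in /v1).

```python
PROXY_URL = "http://localhost:8765"  # default address from the Quick Start


def proxy_base_url(provider: str, proxy_url: str = PROXY_URL) -> str:
    """Return the SDK base_url that routes `provider` through the proxy.

    Hypothetical helper for illustration; not part of nano-proxy.
    """
    root = f"{proxy_url.rstrip('/')}/{provider}"
    # anthropic SDK: base_url without /v1; OpenAI-format SDKs: base_url with /v1.
    return root if provider == "anthropic" else f"{root}/v1"


print(proxy_base_url("anthropic"))
print(proxy_base_url("groq"))
```

The result can be passed directly as `base_url=` to `anthropic.Anthropic(...)` or `openai.OpenAI(...)`.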


Supported Providers

| Provider | Route | Fallback | Cost |
|---|---|---|---|
| Anthropic | /anthropic/v1/... | Yes | Paid |
| OpenAI | /openai/v1/... | Yes | Paid |
| Groq | /groq/v1/... | Yes | Free tier |
| Google Gemini | /gemini/v1/... | Yes | Free tier |
| Ollama | /ollama/v1/... | Yes | Free, local |
| Mistral | /mistral/v1/... | Yes | Paid |

Configuration

# configs/proxy.yaml
host: "127.0.0.1"
port: 8765
log_requests: true

router:
  strategy: priority       # priority | round-robin | cheapest
  fallback: true
  fallback_order:          # explicit fallback sequence
    - groq
    - ollama
  timeout_seconds: 60.0
  max_retries: 2

budget:
  max_cost_usd: 10.0
  alert_at_percent: 0.8
  kill_on_exceed: false

providers:
  - name: anthropic
    base_url: "https://api.anthropic.com"
    api_key_env: "ANTHROPIC_API_KEY"
    priority: 0
    enabled: true

  - name: groq
    base_url: "https://api.groq.com/openai"
    api_key_env: "GROQ_API_KEY"
    priority: 1
    enabled: true

  - name: ollama
    base_url: "http://localhost:11434"
    api_key_env: ""
    priority: 2
    enabled: true
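
The README does not spell out how the budget block is evaluated, so the following is only one plausible reading, written as a sketch: `alert_at_percent` is treated as a fraction of `max_cost_usd` (0.8 = 80%), and `kill_on_exceed` decides whether requests are refused once the cap is reached. `Budget` and `budget_state` are hypothetical names, not nano-proxy APIs.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    # Field names mirror the budget: block in configs/proxy.yaml.
    max_cost_usd: float = 10.0
    alert_at_percent: float = 0.8  # assumed to be a fraction, not 0-100
    kill_on_exceed: bool = False


def budget_state(spent_usd: float, budget: Budget) -> str:
    """Classify current spend as 'ok', 'alert', 'exceeded' or 'blocked'."""
    if spent_usd >= budget.max_cost_usd:
        # Assumption: kill_on_exceed turns an over-budget warning into a hard stop.
        return "blocked" if budget.kill_on_exceed else "exceeded"
    if spent_usd >= budget.max_cost_usd * budget.alert_at_percent:
        return "alert"
    return "ok"


print(budget_state(8.5, Budget()))
```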

Routing Strategies

| Strategy | Behavior |
|---|---|
| priority | Always try highest-priority provider first (default) |
| round-robin | Distribute requests evenly across all providers |
| cheapest | Route to lowest cost-per-token provider available |
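
To make the three strategies concrete, here is a small sketch of how a router could choose a provider. It is not nano-proxy's actual Router: the `Provider` and `Router` classes, the `usd_per_1k_tokens` field, and the prices are all illustrative, and it assumes a lower `priority` value means "tried first", as the sample config suggests.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Provider:
    name: str
    priority: int              # lower value is tried first (assumed from proxy.yaml)
    usd_per_1k_tokens: float   # hypothetical price field, for illustration only


class Router:
    def __init__(self, providers, strategy="priority"):
        self.providers = list(providers)
        self.strategy = strategy
        self._turn = count()   # request counter used by round-robin

    def pick(self) -> Provider:
        if self.strategy == "priority":
            return min(self.providers, key=lambda p: p.priority)
        if self.strategy == "round-robin":
            return self.providers[next(self._turn) % len(self.providers)]
        if self.strategy == "cheapest":
            return min(self.providers, key=lambda p: p.usd_per_1k_tokens)
        raise ValueError(f"unknown strategy: {self.strategy}")


# Illustrative prices, not real pricing data.
providers = [
    Provider("anthropic", 0, 3.0),
    Provider("groq", 1, 0.1),
    Provider("ollama", 2, 0.0),
]
print(Router(providers, "cheapest").pick().name)
```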

Fallback activates automatically when a provider returns 429 (rate limit) or 5xx (server error). The rate-limited provider is blocked for 60 seconds, then re-enabled.
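
The fallback rule can be sketched as follows. This is a simplified model under stated assumptions, not the project's implementation: `RateLimitTracker` here is a stand-in with the same name as the component in the architecture diagram, `send` is a hypothetical callable returning `(status, body)`, and only 429 responses trigger the 60-second block, while 5xx responses just move on to the next provider.

```python
import time


class RateLimitTracker:
    """Blocks a provider for `cooldown` seconds after it is rate limited."""

    def __init__(self, cooldown: float = 60.0, clock=time.monotonic):
        self.cooldown = cooldown
        self.clock = clock  # injectable so the cooldown can be tested
        self._blocked_until = {}

    def block(self, name: str) -> None:
        self._blocked_until[name] = self.clock() + self.cooldown

    def is_available(self, name: str) -> bool:
        until = self._blocked_until.get(name)
        return until is None or self.clock() >= until


def should_fall_back(status: int) -> bool:
    # 429 = rate limit, 5xx = upstream server error
    return status == 429 or 500 <= status <= 599


def call_with_fallback(order, send, tracker):
    """Try providers in `order`; return (name, status, body) of the first success.

    If every attempted provider fails, the last failure is returned.
    """
    last = None
    for name in order:
        if not tracker.is_available(name):
            continue  # still cooling down after a 429
        status, body = send(name)
        if not should_fall_back(status):
            return name, status, body
        if status == 429:
            tracker.block(name)
        last = (name, status, body)
    if last is None:
        raise RuntimeError("no provider available")
    return last
```

Passing the clock in as a parameter keeps the cooldown logic testable without sleeping for a minute.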


API Endpoints

POST  /{provider}/v1/messages          → Anthropic format
POST  /{provider}/v1/chat/completions  → OpenAI format
POST  /auto/v1/...                     → auto-routed

GET   /health                          → provider status
GET   /cost                            → token + USD breakdown
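
A script that launches the proxy may want to wait for /health before sending traffic. The sketch below polls that endpoint using only the standard library. It assumes /health answers with HTTP 200 once the proxy is up; the response body format is not documented here, so it is ignored. `http_status` and `wait_until_healthy` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of a GET, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0         # connection refused, timeout, DNS failure, ...


def wait_until_healthy(base_url="http://localhost:8765", attempts=10,
                       delay=0.5, fetch=http_status, sleep=time.sleep) -> bool:
    """Poll {base_url}/health until it returns 200 or attempts run out."""
    for _ in range(attempts):
        if fetch(f"{base_url}/health") == 200:
            return True
        sleep(delay)
    return False
```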

Cost Response

{
  "total_usd": 0.0115,
  "providers": {
    "anthropic": { "requests": 12, "tokens": 8400, "cost_usd": 0.0112, "errors": 0 },
    "groq":      { "requests": 3,  "tokens": 1200, "cost_usd": 0.0003, "errors": 0 },
    "ollama":    { "requests": 5,  "tokens": 2100, "cost_usd": 0.0000, "errors": 0 }
  }
}
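
The per-provider entries are easy to post-process. As an illustration, this sketch derives the average cost per 1,000 tokens for each provider from a response shaped like the one above. The sample data is copied from that response; `usd_per_1k_tokens` is a hypothetical helper, not something the proxy exposes.

```python
import json

SAMPLE = """
{
  "providers": {
    "anthropic": {"requests": 12, "tokens": 8400, "cost_usd": 0.0112, "errors": 0},
    "groq":      {"requests": 3,  "tokens": 1200, "cost_usd": 0.0003, "errors": 0},
    "ollama":    {"requests": 5,  "tokens": 2100, "cost_usd": 0.0,    "errors": 0}
  }
}
"""


def usd_per_1k_tokens(report: dict) -> dict:
    """Map each provider name to its average USD cost per 1,000 tokens."""
    rates = {}
    for name, stats in report["providers"].items():
        tokens = stats["tokens"]
        # Guard against division by zero for providers with no traffic yet.
        rates[name] = stats["cost_usd"] / tokens * 1000 if tokens else 0.0
    return rates


rates = usd_per_1k_tokens(json.loads(SAMPLE))
for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10s} {rate:.6f} USD / 1k tokens")
```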

Architecture

Client (nano-eval / nano-memory / any SDK)
         │
         ▼
    nano-proxy  (FastAPI, port 8765)
    ├── Router          — strategy: priority | round-robin | cheapest
    │   └── RateLimitTracker  — per-provider backoff, auto-unblock after 60s
    ├── GenericProvider — rewrites auth header, forwards to real API
    ├── CostTracker     — per-provider USD counters, PRICING table
    └── /health /cost   — observability endpoints
         │
         ├── api.anthropic.com
         ├── api.openai.com
         ├── api.groq.com
         ├── generativelanguage.googleapis.com
         ├── localhost:11434  (Ollama)
         └── api.mistral.ai
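
The "rewrites auth header" step is what lets clients pass api_key="anything". A minimal sketch of the idea follows; it is not the real GenericProvider. It assumes Anthropic takes the key in `x-api-key` while the OpenAI-compatible upstreams take `Authorization: Bearer ...`, and that a provider with an empty key (such as local Ollama) gets no auth header. `rewrite_auth` is a hypothetical name.

```python
def rewrite_auth(provider: str, headers: dict, api_key: str) -> dict:
    """Drop the client's placeholder credentials and attach the real key."""
    # Host is dropped too, since the request is re-sent to a different server.
    drop = {"authorization", "x-api-key", "host"}
    out = {k: v for k, v in headers.items() if k.lower() not in drop}
    if provider == "anthropic":
        out["x-api-key"] = api_key                  # Anthropic's native header
    elif api_key:
        out["Authorization"] = f"Bearer {api_key}"  # OpenAI-compatible APIs
    return out


incoming = {"Authorization": "Bearer anything", "Content-Type": "application/json"}
print(rewrite_auth("groq", incoming, "gsk_example"))
```

In a real proxy the key would come from the environment variable named by `api_key_env` in the provider config.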

Environment Variables

ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GROQ_API_KEY=gsk_...
GOOGLE_API_KEY=AIza...
MISTRAL_API_KEY=...
OLLAMA_HOST=http://localhost:11434   # optional, default: localhost:11434
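
Before starting the proxy it can help to check which of these keys are actually set. The sketch below does that; the provider-to-variable mapping is inferred from the list above (in particular, pairing the gemini route with GOOGLE_API_KEY is an assumption), and `missing_keys` is a hypothetical helper.

```python
import os

# Provider -> environment variable. Ollama is local and needs no key.
KEY_ENV = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "gemini": "GOOGLE_API_KEY",   # assumed pairing
    "mistral": "MISTRAL_API_KEY",
}


def missing_keys(env=os.environ) -> list:
    """Return the providers whose API key variable is unset or empty."""
    return [name for name, var in KEY_ENV.items() if not env.get(var)]


print("providers without a key:", missing_keys())
```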

Integration with nano-* Ecosystem

nano-proxy is the single routing layer for all nano-* projects. Each project reads the NANO_PROXY_URL environment variable:

# Set once (bash/zsh)
export NANO_PROXY_URL=http://localhost:8765
# PowerShell equivalent
$env:NANO_PROXY_URL = "http://localhost:8765"

# nano-eval routes through proxy automatically
nano-eval run configs/example.yaml

# nano-memory embeddings route through proxy
# nano-orchestrator agent calls route through proxy

# Python: point the client at the proxy when NANO_PROXY_URL is set
import os
import anthropic

base_url = os.environ.get("NANO_PROXY_URL")
client = anthropic.Anthropic(
    base_url=f"{base_url}/anthropic" if base_url else None
)

CLI Reference

nano-proxy start                              # Start on default port 8765
nano-proxy start --config proxy.yaml          # Start with config file
nano-proxy start --host 0.0.0.0 --port 9000   # Custom host/port
nano-proxy status                             # Check if proxy is running
nano-proxy cost                               # Show cost breakdown

Contributing

git clone https://github.com/ghanibot/nano-proxy
cd nano-proxy
pip install -e ".[dev]"
pytest

License

MIT — see LICENSE
