Agentz-Proxy: Native Python AI Router

Production-ready AI orchestration with intelligent cost optimization. Zero infrastructure cost.

Quick Start

# Make API requests:
curl -X POST http://100.96.197.84:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-oracle1-master" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "chat-best",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 100
  }'

# Or use with Claude Code:
export ANTHROPIC_BASE_URL="http://100.96.197.84:4000"
export ANTHROPIC_API_KEY="sk-oracle1-master"
claude "write code"
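
The same request can be built from Python. The sketch below uses only the standard library and copies the URL, key, and payload from the curl example; the helper names `build_chat_request` and `send` are illustrative and not part of the project.

```python
import json
import urllib.request

# Values copied from the curl example above.
BASE_URL = "http://100.96.197.84:4000"
API_KEY = "sk-oracle1-master"


def build_chat_request(
    prompt: str, model: str = "chat-best", max_tokens: int = 100
) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_chat_request("Hello"))` needs network access to the proxy. If the response follows the OpenAI chat completions shape, the reply text should be under `choices[0]["message"]["content"]`.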

Features

✅ Intelligent Routing - Berkeley RouteLLM for cost optimization (47%+ savings)
✅ Native Python - SQLAlchemy ORM, no Prisma, no TypeScript dependencies
✅ ARM64 Optimized - Runs natively on Oracle Cloud Ampere A1
✅ OpenAI Compatible - Drop-in replacement for OpenAI API
✅ PostgreSQL Logging - All requests tracked with metrics
✅ Self-Hosted CI/CD - GitHub Actions runner on oracle1
✅ Free Infrastructure - $0/month on Oracle Cloud Always Free

Architecture

Client → RouteLLM Proxy (4000) → OpenRouter API
           ↓
    PostgreSQL (logging)
    Redis (caching)
    Prometheus (metrics)
    Grafana (dashboards)
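
The diagram lists Redis as the caching layer but does not say how responses are keyed. One possible approach, shown here purely as an assumption, is to hash a canonical form of the request payload so that identical requests map to the same key. The `cache_key` helper and the `chat:` prefix are hypothetical.

```python
import hashlib
import json


def cache_key(payload: dict) -> str:
    """Derive a deterministic cache key from a chat request payload.

    Hypothetical scheme: the project's real key format is not documented.
    """
    # sort_keys makes the key independent of dictionary ordering.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return "chat:" + digest
```

A key like this could be passed to Redis `GET` before forwarding a request and to `SET` after a response arrives.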

Routing Strategy

Query Length	Model	Cost per 1M Tokens (input/output)	Latency
< 100 chars	Gemini Flash	$0.075/$0.30	~150ms
100-500 chars	GPT-4o-mini	$0.15/$0.60	~300ms
> 500 chars	Claude Sonnet	$3.00/$15.00	~400ms
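
The table can be read as a simple length-based rule. The sketch below encodes only those thresholds; it is not RouteLLM's own routing logic, and the model identifier strings are hypothetical placeholders.

```python
# Thresholds come from the table above. The returned strings are
# placeholder names, not the identifiers the proxy actually uses.
SHORT_LIMIT = 100  # queries under 100 characters
LONG_LIMIT = 500   # queries over 500 characters


def pick_model(query: str) -> str:
    """Choose a model tier from the query length in characters."""
    length = len(query)
    if length < SHORT_LIMIT:
        return "gemini-flash"
    if length <= LONG_LIMIT:
        return "gpt-4o-mini"
    return "claude-sonnet"
```

With this reading, a query of exactly 100 or exactly 500 characters falls in the middle tier.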

Cost Breakdown (60K requests/month)

Without routing:     $126/month (all Claude)
With RouteLLM:       $67/month (47% savings)
With caching (40%):  $40/month (68% savings)

Infrastructure: $0 (Oracle Always Free)
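
The percentages above can be checked with a few lines of arithmetic. This sketch assumes the 40% caching figure is a hit rate applied to the routed cost and that cache hits cost nothing; the source does not state either explicitly.

```python
def savings_percent(baseline: float, actual: float) -> float:
    """Percentage saved relative to the baseline monthly cost."""
    return (baseline - actual) / baseline * 100.0


BASELINE = 126.0       # dollars/month, all requests sent to Claude
ROUTED = 67.0          # dollars/month with RouteLLM
CACHE_HIT_RATE = 0.40  # assumed meaning of "caching (40%)"

# Assumption: a cache hit costs nothing, so only misses are billed.
cached = ROUTED * (1.0 - CACHE_HIT_RATE)

print(round(savings_percent(BASELINE, ROUTED)))  # 47
print(round(cached))                             # 40
print(round(savings_percent(BASELINE, cached)))  # 68
```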

Stack

Services:

RouteLLM Proxy - Python 3.11, FastAPI, SQLAlchemy
PostgreSQL 16 - Request logging
Redis 7 - Caching layer
Prometheus - Metrics collection
Grafana - Visualization
GitHub Actions Runner - Auto-deployment

Infrastructure:

Oracle Cloud Ampere A1 (ARM64)
4 cores, 24GB RAM, 194GB storage
Tailscale private network
GitHub Pro+ (Actions, GHCR, Pages)

Repository Structure

agentz-proxy/
├── services/
│   ├── routellm-proxy/          # Main AI router service
│   ├── attribution-logger/      # Metrics collector
│   └── routing-engine/          # Performance analyzer
├── infra/
│   └── oracle1/
│       ├── docker-compose.yml   # Service orchestration
│       └── configs/             # Service configs
├── docs/                        # Documentation
└── .github/workflows/           # CI/CD pipelines

Documentation

Final Architecture - Complete system overview
Hardware Architecture - Physical deployment
GitHub Pro+ Features - GitHub integration
Self-Hosted Runner - CI/CD setup
Claude Code Usage - IDE integration

Deployment

Automated (Recommended)

git push
# GitHub Actions automatically:
# 1. Runs CI (lint, test, security scan)
# 2. Builds Docker images
# 3. Deploys to oracle1 via self-hosted runner
# Total time: ~2 minutes

Manual

ssh oracle1
cd ~/agentz-proxy
git pull
cd infra/oracle1
docker compose up -d

Monitoring

# Health check
curl http://100.96.197.84:4000/health

# Grafana dashboards
open http://100.96.197.84:3000

# Database metrics
ssh oracle1
docker exec agentz-postgres psql -U litellm -d agentz -c \
  'SELECT model_routed, COUNT(*), AVG(latency_ms) FROM request_logs GROUP BY model_routed;'
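
To try the aggregation without access to oracle1, the same query can be run against an in-memory SQLite stand-in. Only the `request_logs` table name and the `model_routed` and `latency_ms` columns come from the command above; the column types and sample rows are made up, and an `ORDER BY` is added so the output order is deterministic.

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL request_logs table.
# Column types and all rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE request_logs (model_routed TEXT, latency_ms REAL)")
conn.executemany(
    "INSERT INTO request_logs VALUES (?, ?)",
    [
        ("gemini-flash", 140.0),
        ("gemini-flash", 160.0),
        ("claude-sonnet", 400.0),
    ],
)

QUERY = (
    "SELECT model_routed, COUNT(*), AVG(latency_ms) "
    "FROM request_logs GROUP BY model_routed ORDER BY model_routed"
)
rows = conn.execute(QUERY).fetchall()
for model, count, avg_latency in rows:
    print(f"{model}: {count} requests, {avg_latency:.0f} ms average")
```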

Development

# Local development (MacBook M3)
git clone https://github.com/aahmed954/agentz-proxy.git
cd agentz-proxy

# Make changes
# ... edit code ...

# Test against oracle1
curl http://100.96.197.84:4000/health

# Deploy
git add .
git commit -m "feat: add feature"
git push

# Monitor
gh run watch

Tech Stack Highlights

All-Python Stack:

FastAPI, SQLAlchemy, asyncpg, httpx
RouteLLM, Pydantic, Redis
All ARM64 native wheels

No Complexity:

❌ No Prisma
❌ No TypeScript
❌ No binary compatibility issues
❌ No migration headaches

Standard Tools:

✅ SQLAlchemy (Python ORM standard)
✅ PostgreSQL (industry standard)
✅ FastAPI (modern Python web framework)
✅ Docker Compose (simple orchestration)

Performance

Target: Sub-500ms for 90% of requests
Actual: 260-700ms depending on model

Breakdown:
- Routing: <1ms (RouteLLM)
- Network: 50-100ms (Tailscale)
- Model TTFT: 150-600ms (varies by model)

Throughput: 100+ req/sec on 4 ARM cores
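
The breakdown can be summed into a rough latency budget, and the 90% target can be checked against measured samples. The sketch below models "<1ms" routing as 0 to 1 ms and uses a nearest-rank percentile; the helper names are illustrative.

```python
import math

# Component ranges in milliseconds, taken from the breakdown above.
# Routing is listed as "<1ms", so it is modeled here as 0 to 1 ms.
COMPONENTS = {
    "routing": (0, 1),
    "network": (50, 100),
    "model_ttft": (150, 600),
}

best_case = sum(low for low, _ in COMPONENTS.values())
worst_case = sum(high for _, high in COMPONENTS.values())


def p90(latencies_ms: list[float]) -> float:
    """Nearest-rank 90th percentile of the given latencies."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = math.ceil(len(ordered) * 9 / 10)
    return ordered[rank - 1]


def meets_target(latencies_ms: list[float], target_ms: float = 500.0) -> bool:
    """True when 90% of requests finish under the target."""
    return p90(latencies_ms) < target_ms
```

Summing the ranges gives a budget of roughly 200 to 701 ms, close to the reported 260-700ms. Feeding `meets_target` the `latency_ms` values from `request_logs` would be one way to check the sub-500ms goal.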

Contributing

1. Fork the repository
2. Create feature branch (git checkout -b feature/amazing)
3. Commit changes (git commit -m 'feat: add amazing')
4. Push to branch (git push origin feature/amazing)
5. Open Pull Request

License

MIT License - See LICENSE file

Acknowledgments

Berkeley RouteLLM - Intelligent routing library
Oracle Cloud - Free ARM64 infrastructure
OpenRouter - Multi-provider AI API
GitHub - CI/CD, Container Registry, Pages

Built with native Python. No Prisma. No complexity. Just works. 🚀
