Stop burning money on AI tokens. Ship reliable agents that won't break in production.
Shannon is battle-tested infrastructure for AI agents that solves the problems you'll hit at scale: runaway costs, non-deterministic failures, and security nightmares. Built on Temporal workflows and WASI sandboxing, it's the platform we wished existed when our LLM bills hit $50k/month.
Real-time observability dashboard showing agent traffic control, metrics, and event streams
⭐ Please star this repo to show your support and stay updated! ⭐
- Zero-Token Templates – YAML workflows eliminate LLM calls for common patterns (→ Template Guide, → Getting Started)
- DAG nodes for parallel execution with dependency resolution
- Supervisor nodes for hierarchical multi-agent coordination
- Template inheritance for reusable workflow composition
- Automatic pattern degradation when budget constrained
- Learning Router – UCB algorithm selects optimal strategies, up to 85-95% token savings in internal testing (→ Details; see the UCB sketch after this list)
- Rate-Aware Execution – Provider-specific RPM/TPM limits prevent throttling (→ Rate Control)
- Automatic multi-agent orchestration – Describe the goal; Shannon decomposes into subtasks and schedules DAG execution with dependencies resolved.
- Plug-and-play tools – Add REST APIs via MCP or OpenAPI, or write Python tools; no proto/Rust/Go changes needed (→ Guide). Domain-specific integrations via vendor adapter pattern (→ Vendor Guide).
- Multiple AI patterns – ReAct, Chain-of-Thought, Tree-of-Thoughts, Debate, Reflection (selectable via cognitive_strategy).
- Time-travel debugging – Export and replay any workflow to reproduce exact agent behavior.
- Hot configuration – Live reload for model pricing and OPA policies (config/models.yaml, config/opa/policies).
- WASI sandbox for code – CPython 3.11 in a WASI sandbox (stdlib, no network, read-only FS). See Python Code Execution.
- Token budget control – Hard per-agent/per-task budgets with live usage tracking and enforcement.
- Policy engine (OPA) – Fine-grained rules for tools, models, and data; hot-reload policies; approvals at /approvals/decision.
- Multi-tenancy – Tenant-scoped auth, sessions, memory, and workflows with isolation guarantees.
- Cost optimization – Caching, session persistence, context shaping, and budget-aware routing.
- Provider support – OpenAI, Anthropic, Google (Gemini), Groq, plus OpenAI-compatible endpoints (e.g., DeepSeek, Qwen, Ollama). Centralized pricing via config/models.yaml.
- Observable by default – Real-time dashboard, Prometheus metrics, OpenTelemetry tracing.
- Distributed by design – Temporal-backed workflows with horizontal scaling.
- Clean State-Compute Separation – Go Orchestrator owns all persistent state (Qdrant vector store, session memory); Python LLM Service is stateless compute (provider abstraction with exact-match caching only).
- Comprehensive memory – Session memory in Redis + vector memory in Qdrant with MMR-based diversity; optional hierarchical recall in workflows (all managed by Go).
- Continuous learning – Records decomposition and failure patterns for future planning and mitigation; learns across sessions to improve strategy selection.
- Sliding-window shaping – Primers + previous summary + recents, with token-aware budgets and live progress events.
- Details: see docs/context-window-management.md and docs/llm-service-caching.md
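For intuition on the Learning Router claim above, here is a textbook UCB1 bandit sketch over strategy arms. The strategy names and reward signal are illustrative assumptions, not Shannon's actual router code:

```python
# Generic UCB1 bandit over execution strategies: play each arm once, then
# favor arms with high mean reward plus an exploration bonus. Textbook UCB1;
# strategy names and the reward function are assumptions for illustration.
import math
import random

STRATEGIES = ["template", "react", "dag", "supervisor"]
counts = {s: 0 for s in STRATEGIES}
rewards = {s: 0.0 for s in STRATEGIES}

def pick(total: int) -> str:
    for s in STRATEGIES:                      # explore every arm once first
        if counts[s] == 0:
            return s
    return max(STRATEGIES, key=lambda s: rewards[s] / counts[s]
               + math.sqrt(2 * math.log(total) / counts[s]))

for t in range(1, 201):
    s = pick(t)
    # Reward could be e.g. inverse token cost of a successful run (assumed).
    reward = random.betavariate(5, 2) if s == "template" else random.betavariate(2, 5)
    counts[s] += 1
    rewards[s] += reward

print(max(STRATEGIES, key=lambda s: counts[s]))  # converges on the cheap arm
```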
Model pricing is centralized in config/models.yaml - all services load from this single source for consistent cost tracking.
| Challenge | Shannon | LangGraph | AutoGen | CrewAI |
|---|---|---|---|---|
| Multi-Agent Orchestration | ✅ DAG/Graph workflows | ✅ Stateful graphs | ✅ Group chat | ✅ Crew/roles |
| Agent Communication | ✅ Message passing | ✅ Tool calling | ✅ Conversations | ✅ Delegation |
| Memory & Context | ✅ Chunked storage (character-based), MMR diversity, decomposition/failure pattern learning | ✅ Multiple types | ✅ Conversation history | ✅ Shared memory |
| Debugging Production Issues | ✅ Replay any workflow | ⚠️ Limited debugging | ⚠️ Basic logging | ❌ |
| Token Cost Control | ✅ Hard budget limits | ❌ | ❌ | ❌ |
| Security Sandbox | ✅ WASI isolation | ❌ | ❌ | ❌ |
| Policy Control (OPA) | ✅ Fine-grained rules | ❌ | ❌ | ❌ |
| Deterministic Replay | ✅ Time-travel debugging | ❌ | ❌ | ❌ |
| Session Persistence | ✅ Redis-backed, durable | ❌ | | |
| Multi-Language | ✅ Go/Rust/Python | | | |
| Production Metrics | ✅ Dashboard/Prometheus | ❌ | ❌ | |
- Docker and Docker Compose
- Make, curl, grpcurl
- An API key for at least one supported LLM provider
Docker Setup Instructions (click to expand)
macOS:
# Install Docker Desktop from https://www.docker.com/products/docker-desktop/
# Or using Homebrew:
brew install --cask docker

Linux (Ubuntu/Debian):
# Install Docker Engine
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker $USER
# Log out and back in for group changes to take effect
# Install Docker Compose
sudo apt-get update
sudo apt-get install docker-compose-plugin

docker --version
docker compose version

The make dev command starts all services (a port-check sketch follows the list):
- PostgreSQL: Database on port 5432
- Redis: Cache on port 6379
- Qdrant: Vector store on port 6333
- Temporal: Workflow engine on port 7233 (UI on 8088)
- Orchestrator: Go service on port 50052
- Agent Core: Rust service on port 50051
- LLM Service: Python service on port 8000
- Gateway: REST API gateway on port 8080
- Dashboard: Real-time observability UI on port 2111
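If you want a quick programmatic check that everything came up after `make dev` (run from the Quick Start below), a minimal TCP port probe works. This assumes the default localhost ports listed above and makes no Shannon-specific API calls:

```python
# Minimal sketch: verify the documented service ports are reachable.
# Pure TCP connect checks; no Shannon endpoints are assumed.
import socket

SERVICES = {
    "PostgreSQL": 5432,
    "Redis": 6379,
    "Qdrant": 6333,
    "Temporal": 7233,
    "Temporal UI": 8088,
    "Orchestrator (gRPC)": 50052,
    "Agent Core (gRPC)": 50051,
    "LLM Service": 8000,
    "Gateway (REST)": 8080,
    "Dashboard": 2111,
}

for name, port in SERVICES.items():
    try:
        with socket.create_connection(("localhost", port), timeout=2):
            print(f"OK    {name} :{port}")
    except OSError:
        print(f"DOWN  {name} :{port}")
```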
git clone https://github.com/Kocoro-lab/Shannon.git
cd Shannon
# One-stop setup: creates .env, generates protobuf files
make setup
# Add your LLM API key to .env
echo "OPENAI_API_KEY=your-key-here" >> .env
# Download Python WASI interpreter for secure code execution (20MB)
./scripts/setup_python_wasi.sh
# Start all services and verify
make dev
make smoke
# (Optional) Start Grafana & Prometheus monitoring
cd deploy/compose/grafana && docker compose -f docker-compose-grafana-prometheus.yml up -d

Shannon provides multiple ways to interact with your AI agents:
# Open the Shannon Dashboard in your browser
open http://localhost:2111
# The dashboard provides:
# - Visual task submission interface
# - Real-time event streaming
# - System metrics and monitoring
# - Task history and results

# For development (no auth required)
export GATEWAY_SKIP_AUTH=1
# Submit a task via API
curl -X POST http://localhost:8080/api/v1/tasks \
-H "Content-Type: application/json" \
-d '{
"query": "Analyze the sentiment of: Shannon makes AI agents simple!",
"session_id": "demo-session-123"
}'
# Response includes workflow_id for tracking
# {"workflow_id":"task-dev-1234567890","status":"running"}pip install shannon-sdkfrom shannon import ShannonClient
with ShannonClient(grpc_endpoint="localhost:50052",
http_endpoint="http://localhost:8081") as client:
handle = client.submit_task("Analyze: Shannon vs AgentKit", user_id="demo")
status = client.wait(handle.task_id)
print(status.status.value, status.result)

CLI is also available after install: shannon --endpoint localhost:50052 submit "Hello".
# Stream live events as your agent works (replace with your workflow_id)
curl -N http://localhost:8081/stream/sse?workflow_id=task-dev-1234567890
# You'll see human-readable events like:
# event: AGENT_THINKING
# data: {"message":"Analyzing sentiment: Shannon makes AI agents simple!"}
#
# event: TOOL_INVOKED
# data: {"message":"Processing natural language sentiment analysis"}
#
# event: AGENT_COMPLETED
# data: {"message":"Task completed successfully"}# Check final status and result
curl http://localhost:8080/api/v1/tasks/task-dev-1234567890
# Response includes status, result, tokens used, and metadata

For production, use API keys instead of GATEWAY_SKIP_AUTH:
# Create an API key (one-time setup)
make seed-api-key # Creates test key: sk_test_123456
# Use in requests
curl -X POST http://localhost:8080/api/v1/tasks \
-H "X-API-Key: sk_test_123456" \
-H "Content-Type: application/json" \
-d '{"query":"Your task here"}'# Soft-delete a session you own (idempotent, returns 204)
curl -X DELETE http://localhost:8080/api/v1/sessions/<SESSION_UUID> \
-H "X-API-Key: sk_test_123456"
# Notes:
# - Marks the session as deleted (deleted_at/deleted_by); data remains in DB
# - Deleted sessions are excluded from reads and cannot be fetched
# - Redis cache for the session is cleared

Advanced Methods: Scripts, gRPC, and Command Line (click to expand)
# Submit a simple task
./scripts/submit_task.sh "Analyze the sentiment of: 'Shannon makes AI agents simple!'"
# Check session usage and token tracking (session ID is in SubmitTask response message)
grpcurl -plaintext \
-d '{"sessionId":"YOUR_SESSION_ID"}' \
localhost:50052 shannon.orchestrator.OrchestratorService/GetSessionContext
# Export and replay a workflow history (use the workflow ID from submit_task output)
./scripts/replay_workflow.sh <WORKFLOW_ID>

# Submit via gRPC
grpcurl -plaintext \
-d '{"metadata":{"userId":"user1","sessionId":"test-session"},"query":"Analyze sentiment"}' \
localhost:50052 shannon.orchestrator.OrchestratorService/SubmitTask
# Stream events via gRPC
grpcurl -plaintext \
-d '{"workflowId":"task-dev-1234567890"}' \
localhost:50052 shannon.orchestrator.StreamingService/StreamTaskExecution

# Connect to WebSocket for bidirectional streaming
# Via admin port (no auth):
wscat -c ws://localhost:8081/stream/ws?workflow_id=task-dev-1234567890
# Or via gateway (with auth):
# wscat -c ws://localhost:8080/api/v1/stream/ws?workflow_id=task-dev-1234567890 \
# -H "Authorization: Bearer YOUR_API_KEY"# Access Shannon Dashboard for real-time monitoring
open http://localhost:2111
# Dashboard features:
# - Real-time task execution and event streams
# - System metrics and performance graphs
# - Token usage tracking and budget monitoring
# - Agent traffic control visualization
# - Interactive command execution
# Access Temporal Web UI for workflow debugging
open http://localhost:8088
# Temporal UI provides:
# - Workflow execution history and timeline
# - Task status, retries, and failures
# - Input/output data for each step
# - Real-time workflow progress
# - Search workflows by ID, type, or status

The visual tools provide comprehensive monitoring:
- Shannon Dashboard (http://localhost:2111) - Real-time agent traffic control, metrics, and events
- Temporal UI (http://localhost:8088) - Workflow debugging and state inspection
- Grafana (http://localhost:3000) - System metrics visualization with Prometheus (optional, see monitoring setup)
- Prometheus (http://localhost:9090) - Metrics collection and querying (optional)
- Combined view - Full visibility into your AI agents' behavior and system performance
Click each example below to expand. These showcase Shannon's unique features that set it apart from other frameworks.
Example 1: Cost-Controlled Customer Support
curl -X POST http://localhost:8080/api/v1/tasks \
-H "Content-Type: application/json" \
-d '{
"query": "Help me troubleshoot my deployment issue",
"session_id": "user-123-session"
}'

Key features:
- Session persistence - Maintains conversation context across requests
- Token tracking - Every request returns token usage and costs
- Policy control - Apply OPA policies for allowed actions (see Example 3)
- Result: Up to 70% cost reduction through smart caching and session management (based on internal testing)
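A minimal sketch of what session reuse looks like over the REST API. The endpoint and payload fields are the documented ones above; the response is assumed to carry workflow_id as shown in the Quick Start:

```python
# Minimal sketch: two requests sharing one session_id so the second call
# reuses conversation context instead of resending it. Assumes the dev
# gateway on localhost:8080 with GATEWAY_SKIP_AUTH=1.
import requests

BASE = "http://localhost:8080/api/v1/tasks"
SESSION = "user-123-session"

def submit(query: str) -> str:
    resp = requests.post(BASE, json={"query": query, "session_id": SESSION})
    resp.raise_for_status()
    return resp.json()["workflow_id"]

# First turn establishes context; the follow-up rides on the stored session.
print(submit("Help me troubleshoot my deployment issue"))
print(submit("The error mentions a missing environment variable"))
```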
Example 2: Debugging Production Failures
# Production agent failed at 3am? No problem.
# Export and replay the workflow in one command
./scripts/replay_workflow.sh task-prod-failure-123
# Or specify a particular run ID
./scripts/replay_workflow.sh task-prod-failure-123 abc-def-ghi
# Output shows step-by-step execution with token counts, decisions, and state changes
# Fix the issue, add a test case, never see it again

Example 3: Multi-Team Model Governance
# config/opa/policies/data-science.rego
package shannon.teams.datascience
default allow = false
allow {
input.team == "data-science"
input.model in ["gpt-4o", "claude-3-sonnet"]
}
max_tokens = 50000 {
input.team == "data-science"
}
# config/opa/policies/customer-support.rego
package shannon.teams.support
default allow = false
allow {
input.team == "support"
input.model == "gpt-4o-mini"
}
max_tokens = 5000 {
input.team == "support"
}
deny_tool["database_write"] {
input.team == "support"
}

Example 4: Security-First Code Execution
# Python code runs in isolated WASI sandbox with full standard library
./scripts/submit_task.sh "Execute Python: print('Hello from secure WASI!')"
# Even malicious code is safe
./scripts/submit_task.sh "Execute Python: import os; os.system('rm -rf /')"
# Result: OSError - system calls blocked by WASI sandbox
# Advanced: Session persistence for data analysis
./scripts/submit_task.sh "Execute Python with session 'analysis': data = [1,2,3,4,5]"
./scripts/submit_task.sh "Execute Python with session 'analysis': print(sum(data))"
# Output: 15

Example 5: Human-in-the-Loop Approval
# Configure approval for high-complexity or dangerous operations
cat > config/features.yaml << 'EOF'
workflows:
approval:
enabled: true
complexity_threshold: 0.7 # Require approval for complex tasks
dangerous_tools: ["file_delete", "database_write", "api_call"]
EOF
# Submit a complex task that triggers approval
./scripts/submit_task.sh "Delete all temporary files older than 30 days from /tmp"
# Workflow pauses and waits for human approval
# Check Temporal UI: http://localhost:8088
# Approve via signal: temporal workflow signal --workflow-id <ID> --name approval --input '{"approved":true}'

Unique to Shannon: Configurable approval workflows based on complexity scoring and tool usage.
Example 6: Multi-Agent Memory & Learning
# Agent learns from conversation and applies knowledge
SESSION="learning-session-$(date +%s)"
# Agent learns your preferences
./scripts/submit_task.sh "I prefer Python over Java for data science" "$SESSION"
./scripts/submit_task.sh "I like using pandas and numpy for analysis" "$SESSION"
./scripts/submit_task.sh "My projects usually involve machine learning" "$SESSION"
# Later, agent recalls and applies this knowledge
./scripts/submit_task.sh "What language and tools should I use for my new data project?" "$SESSION"
# Response includes personalized recommendations based on learned preferences
# Check memory storage (character-based chunking with MMR diversity)
grpcurl -plaintext -d "{\"sessionId\":\"$SESSION\"}" \
localhost:50052 shannon.orchestrator.OrchestratorService/GetSessionContext

Unique to Shannon: Persistent memory with intelligent chunking (4 chars ≈ 1 token) and MMR diversity ranking.
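For intuition on the MMR diversity ranking mentioned above, here is a compact, generic sketch of maximal marginal relevance reranking with numpy. It illustrates the standard algorithm, not Shannon's internal implementation:

```python
# Illustrative MMR reranking: pick items relevant to the query but
# dissimilar to what is already selected. Generic sketch, not Shannon's code.
import numpy as np

def mmr(query_vec, doc_vecs, k=3, lam=0.7):
    """Return indices of k docs balancing relevance (lam) vs. diversity."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    relevance = [cos(query_vec, d) for d in doc_vecs]
    selected, candidates = [], list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max((cos(doc_vecs[i], doc_vecs[j]) for j in selected),
                             default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

docs = np.random.rand(10, 384)           # stand-ins for embedding vectors
print(mmr(np.random.rand(384), docs))    # indices of diverse, relevant memories
```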
Example 7: Supervisor Workflow with Dynamic Strategy
# Complex task automatically delegates to multiple specialized agents
./scripts/submit_task.sh "Analyze our website performance, identify bottlenecks, and create an optimization plan with specific recommendations"
# Watch the orchestration in real-time
curl -N "http://localhost:8081/stream/sse?workflow_id=<WORKFLOW_ID>"
# Events show:
# - Complexity analysis (score: 0.85)
# - Strategy selection (supervisor pattern chosen)
# - Dynamic agent spawning (analyzer, investigator, planner)
# - Parallel execution with coordination
# - Synthesis and quality reflection

Unique to Shannon: Automatic workflow pattern selection based on task complexity.
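The routing decision can be pictured roughly like the sketch below. The 0.7 cutoff echoes the complexity_threshold from the approval example; the function, pattern names, and second input are illustrative assumptions, not Shannon's actual router:

```python
# Illustrative complexity-based pattern selection. Thresholds and the
# subtask_count heuristic are assumptions for demonstration only.
def select_pattern(complexity: float, subtask_count: int) -> str:
    if complexity >= 0.7 and subtask_count > 3:
        return "supervisor"      # hierarchical multi-agent coordination
    if subtask_count > 1:
        return "dag"             # parallel execution with dependencies
    return "react"               # single-agent reason/act loop

print(select_pattern(0.85, 5))   # -> "supervisor", as in the events above
```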
Example 8: Time-Travel Debugging with State Inspection
# Production issue at 3am? Debug it step-by-step
FAILED_WORKFLOW="task-prod-failure-20250928-0300"
# Export with full state history
./scripts/replay_workflow.sh export $FAILED_WORKFLOW debug.json
# Inspect specific decision points
go run ./tools/replay -history debug.json -inspect-step 5
# Modify and test fix locally
go run ./tools/replay -history debug.json -override-activity GetLLMResponse
# Validate fix passes all historical workflows
make ci-replay

Unique to Shannon: Complete workflow state inspection and modification for debugging.
Example 9: Token Budget with Circuit Breakers
# Set strict budget with automatic fallbacks
curl -X POST http://localhost:8080/api/v1/tasks \
-H "Content-Type: application/json" \
-H "X-API-Key: sk_test_123456" \
-d '{
"query": "Generate a comprehensive market analysis report",
"session_id": "budget-test",
"config": {
"budget": {
"max_tokens": 5000,
"fallback_model": "gpt-4o-mini",
"circuit_breaker": {
"threshold": 0.8,
"cooldown_seconds": 60
}
}
}
}'
# System automatically:
# - Switches to cheaper model when 80% budget consumed
# - Implements cooldown period to prevent runaway costs
# - Returns partial results if budget exhausted

Unique to Shannon: Real-time budget enforcement with automatic degradation.
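Conceptually, the enforcement loop behaves like the sketch below. The field names mirror the request payload above; the class itself is an illustration of the circuit-breaker idea, not Shannon's implementation:

```python
# Illustrative token-budget circuit breaker: degrade to a cheaper model at
# 80% of budget, stop entirely when exhausted. Mirrors the config in spirit.
import time

class BudgetBreaker:
    def __init__(self, max_tokens=5000, threshold=0.8, cooldown_seconds=60,
                 primary="gpt-4o", fallback="gpt-4o-mini"):
        self.max_tokens = max_tokens
        self.threshold = threshold
        self.cooldown_seconds = cooldown_seconds
        self.primary, self.fallback = primary, fallback
        self.used = 0
        self.tripped_at = None

    def record(self, tokens: int):
        self.used += tokens
        if self.used >= self.max_tokens * self.threshold and self.tripped_at is None:
            self.tripped_at = time.monotonic()   # trip: switch to fallback model

    def model(self) -> str:
        if self.used >= self.max_tokens:
            raise RuntimeError("budget exhausted: return partial results")
        return self.fallback if self.tripped_at is not None else self.primary

breaker = BudgetBreaker()
breaker.record(4200)        # 84% of budget -> breaker trips
print(breaker.model())      # -> "gpt-4o-mini"
```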
Example 10: Multi-Tenant Agent Isolation
# Each tenant gets isolated agents with separate policies
# Tenant A: Data Science team
curl -X POST http://localhost:8080/api/v1/tasks \
-H "X-API-Key: sk_tenant_a_key" \
-H "X-Tenant-ID: data-science" \
-d '{"query": "Train a model on our dataset"}'
# Tenant B: Customer Support
curl -X POST http://localhost:8080/api/v1/tasks \
-H "X-API-Key: sk_tenant_b_key" \
-H "X-Tenant-ID: support" \
-d '{"query": "Access customer database"}' # Denied by OPA policy
# Complete isolation:
# - Separate memory/vector stores per tenant
# - Independent token budgets
# - Custom model access
# - Isolated session management

Unique to Shannon: Enterprise-grade multi-tenancy with OPA policy enforcement.
More Production Examples (click to expand)
- Incident Response Bot: Auto-triages alerts with budget limits
- Code Review Agent: Enforces security policies via OPA rules
- Data Pipeline Monitor: Replays failed workflows for debugging
- Compliance Auditor: Full trace of every decision and data access
- Multi-Tenant SaaS: Complete isolation between customer agents
┌─────────────┐     ┌──────────────┐     ┌─────────────┐
│   Client    │────▶│ Orchestrator │────▶│ Agent Core  │
└─────────────┘     │     (Go)     │     │   (Rust)    │
                    └──────────────┘     └─────────────┘
                           │                    │
                           ▼                    ▼
                    ┌──────────────┐     ┌─────────────┐
                    │   Temporal   │     │ WASI Tools  │
                    │  Workflows   │     │   Sandbox   │
                    └──────────────┘     └─────────────┘
                           │
                           ▼
                    ┌──────────────┐
                    │ LLM Service  │
                    │   (Python)   │
                    └──────────────┘
┌─────────────────────────────────────────────────────────────────┐
│                          CLIENT LAYER                           │
├──────────────┬──────────────┬──────────────┬────────────────────┤
│     HTTP     │     gRPC     │     SSE      │     WebSocket      │
└──────────────┴──────────────┴──────────────┴────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                        ORCHESTRATOR (Go)                        │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌──────────┐   │
│  │   Router   │──│   Budget   │──│  Session   │──│   OPA    │   │
│  │            │  │  Manager   │  │   Store    │  │ Policies │   │
│  └────────────┘  └────────────┘  └────────────┘  └──────────┘   │
└─────────────────────────────────────────────────────────────────┘
       │                │                │                │
       ▼                ▼                ▼                ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│   Temporal   │ │    Redis     │ │  PostgreSQL  │ │    Qdrant    │
│  Workflows   │ │    Cache     │ │    State     │ │   Vectors    │
│              │ │   Sessions   │ │   History    │ │    Memory    │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                        AGENT CORE (Rust)                        │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌──────────┐   │
│  │    WASI    │──│   Policy   │──│    Tool    │──│  Agent   │   │
│  │  Sandbox   │  │  Enforcer  │  │  Registry  │  │  Comms   │   │
│  └────────────┘  └────────────┘  └────────────┘  └──────────┘   │
└─────────────────────────────────────────────────────────────────┘
                │                               │
                ▼                               ▼
┌────────────────────────────────┐ ┌─────────────────────────────────┐
│      LLM SERVICE (Python)      │ │       OBSERVABILITY LAYER       │
│ ┌────────────┐  ┌────────────┐ │ │ ┌────────────┐  ┌────────────┐  │
│ │  Provider  │  │    MCP     │ │ │ │ Prometheus │  │  OpenTel   │  │
│ │  Adapter   │  │   Tools    │ │ │ │  Metrics   │  │   Traces   │  │
│ └────────────┘  └────────────┘ │ │ └────────────┘  └────────────┘  │
└────────────────────────────────┘ └─────────────────────────────────┘
- Orchestrator (Go): Task routing, budget enforcement, session management, OPA policy evaluation
- Agent Core (Rust): WASI sandbox execution, policy enforcement, agent-to-agent communication
- LLM Service (Python): Provider abstraction (15+ LLMs), MCP tools, prompt optimization
- Gateway (Go): REST API, authentication, rate limiting, request validation
- Dashboard (React/Next.js): Real-time monitoring, metrics visualization, event streaming
- Data Layer: PostgreSQL (workflow state), Redis (session cache), Qdrant (vector memory)
- Observability: Built-in dashboard, Prometheus metrics, OpenTelemetry tracing
# Clone and configure
git clone https://github.com/Kocoro-lab/Shannon.git
cd Shannon
make setup-env
echo "OPENAI_API_KEY=sk-..." >> .env
# Launch
make dev
# Set budgets per request (see "Examples That Actually Matter" section)
# Configure in SubmitTask payload: {"budget": {"max_tokens": 5000}}

# Create your first OPA policy
cat > config/opa/policies/default.rego << EOF
package shannon
default allow = false
# Allow all for dev, restrict in prod
allow {
input.environment == "development"
}
# Production rules
allow {
input.environment == "production"
input.tokens_requested < 10000
input.model in ["gpt-4o-mini", "claude-4-haiku"]
}
EOF
# Hot reload - no restart needed!

# Something went wrong in production?
# 1. Find the workflow ID from logs
grep ERROR logs/orchestrator.log | tail -1
# 2. Export the workflow
./scripts/replay_workflow.sh export task-xxx-failed debug.json
# 3. Replay locally to see exactly what happened
./scripts/replay_workflow.sh replay debug.json
# 4. Fix, test, deploy with confidence

# config/teams.yaml
teams:
data-science:
models: ["gpt-4o", "claude-4-sonnet"]
max_tokens_per_day: 1000000
tools: ["*"]
customer-support:
models: ["gpt-4o-mini"]
max_tokens_per_day: 50000
tools: ["search", "respond", "escalate"]
engineering:
models: ["claude-4-sonnet", "gpt-4o"]
max_tokens_per_day: 500000
tools: ["code_*", "test_*", "deploy_*"]Shannon uses a layered configuration system with clear precedence:
Shannon uses a layered configuration system with clear precedence:

1. Environment Variables (.env) - Highest priority, for secrets and deployment-specific settings
2. Docker Compose - Service configurations and port mappings
3. YAML Files (config/features.yaml) - Feature flags and default settings

Key configuration files:

- config/features.yaml - Feature toggles, workflow settings, enforcement policies
- config/models.yaml - LLM provider configuration and pricing
- .env - API keys and runtime overrides (see .env.example)
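A minimal sketch of how such layering resolves in practice. The get_setting helper, the env-var naming convention, and the key paths are illustrative assumptions, not Shannon's loader:

```python
# Illustrative config resolution: environment variables win over YAML
# defaults. Helper and key names are assumptions for demonstration.
import os
import yaml

def get_setting(key: str, yaml_path: str = "config/features.yaml"):
    env_value = os.environ.get(key.upper().replace(".", "_"))
    if env_value is not None:
        return env_value                       # 1. env var wins
    node = yaml.safe_load(open(yaml_path)) or {}
    for part in key.split("."):                # 2. fall back to YAML
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node

# e.g. WORKFLOWS_APPROVAL_ENABLED=1 would override workflows.approval.enabled
print(get_setting("workflows.approval.enabled"))
```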
- What: Client-side response cache in the Python LLM service.
- Defaults: In-memory LRU with TTL from config/models.yaml → prompt_cache.ttl_seconds (fallback 3600s).
- Distributed: Set REDIS_URL (or REDIS_HOST/REDIS_PORT/REDIS_PASSWORD) to enable a Redis-backed cache across instances.
- Keying: Deterministic hash of messages + key params (tier, model override, temperature, max_tokens, functions, seed).
- Behavior: Non-streaming calls are cacheable; streaming uses the cache to return the full result as a single chunk when available.
See: docs/llm-service-caching.md
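The keying scheme can be illustrated like this: a deterministic hash over the messages plus the parameters listed above. The exact serialization and field set here are assumptions; see the caching doc for the real behavior:

```python
# Illustrative cache-key construction: hash the canonicalized messages plus
# the key parameters. Serialization details are assumed, not Shannon's code.
import hashlib
import json

def cache_key(messages, tier=None, model_override=None, temperature=0.7,
              max_tokens=None, functions=None, seed=None) -> str:
    payload = {
        "messages": messages,
        "tier": tier,
        "model_override": model_override,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "functions": functions,
        "seed": seed,
    }
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

key = cache_key([{"role": "user", "content": "hello"}], tier="small")
print(key[:16])  # identical inputs always yield the same key
```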
For detailed configuration documentation, see config/README.md.
# Run linters and formatters
make lint
make fmt
# Run smoke tests
make smoke
# View logs
make logs
# Check service status
make ps

We love contributions! Please see our Contributing Guide for details.
- Discord: Join our Discord
- Twitter/X: @shannon_agents
Now – v0.1 (Production Ready)
- ✅ Core platform stable - Go orchestrator, Rust agent-core, Python LLM service
- ✅ Deterministic replay debugging - Export and replay any workflow execution
- ✅ OPA policy enforcement - Fine-grained security and governance rules
- ✅ WebSocket streaming - Real-time agent communication with event filtering and replay
- ✅ SSE streaming - Server-sent events for browser-native streaming
- ✅ WASI sandbox - Secure code execution environment with resource limits
- ✅ Multi-agent orchestration - DAG, parallel, sequential, hybrid (dependency-based), ReAct, Tree-of-Thoughts, Chain-of-Thought, Debate, Reflection patterns
- ✅ Vector memory - Qdrant-based semantic search and context retrieval
- ✅ Hierarchical memory - Recent + semantic retrieval with deduplication and compression
- ✅ Near-duplicate detection - 95% similarity threshold to prevent redundant storage
- ✅ Token-aware context management - Configurable windows (5-200 msgs), smart selection, sliding window compression
- ✅ Circuit breaker patterns - Automatic failure recovery and degradation
- ✅ Multi-provider LLM support - OpenAI, Anthropic, Google, DeepSeek, and more
- ✅ Token budget management - Per-agent and per-task limits with validation
- ✅ Session management - Durable state with Redis/PostgreSQL persistence
- ✅ Agent Coordination - Direct agent-to-agent messaging, dynamic team formation, collaborative planning
- ✅ MCP integration - Model Context Protocol support for standardized tool interfaces
- ✅ OpenAPI integration - REST API tools with retry logic, circuit breaker, and ~70% API coverage
- ✅ Provider abstraction layer - Unified interface for adding new LLM providers with automatic fallback
- ✅ Advanced Task Decomposition - Recursive decomposition with ADaPT patterns, chain-of-thought planning, task template library
- ✅ Composable workflows - YAML-based workflow templates with declarative orchestration patterns
- ✅ Unified Gateway & SDKs - REST API gateway, Python/TypeScript SDKs, CLI tool for easy adoption
- 🚧 Ship Docker images - Pre-built Docker release images to make setup straightforward
v0.2
- (Optional) Drag and Drop UI - AgentKit-like drag & drop UI to generate workflow yaml templates
- Native tool expansion - Additional Rust-native tools for file operations and system interactions
- Advanced Memory - Episodic rollups, entity/temporal knowledge graphs, hybrid dense+sparse retrieval
- Advanced Learning - Pattern recognition from successful workflows, contextual bandits for agent selection
- Agent Collaboration Foundation - Agent roles/personas, agent-specific memory, supervisor hierarchies
- MMR diversity reranking - Implement actual MMR algorithm for diverse retrieval (config ready, 40% done)
- Performance-based agent selection - Epsilon-greedy routing using agent_executions metrics
- Context streaming events - Add 4 new event types (CONTEXT_BUILDING, MEMORY_RECALL, etc.)
- Budget enforcement in supervisor - Pre-spawn validation and circuit breakers for multi-agent cost control
- Use case presets - YAML-based presets for debugging/analysis modes with preset selection logic
- Debate outcome persistence - Store consensus decisions in Qdrant for learning
- Shared workspace functions - Agent artifact sharing (AppendToWorkspace/ListWorkspaceItems)
- Intelligent Tool Selection - Semantic tool result caching, agent experience learning, performance-based routing
- Native RAG System - Document chunking service, knowledge base integration, context injection with source attribution
v0.3
- Solana Integration - Decentralized trust, on-chain attestation, and blockchain-based audit trails for agent actions
- Production Observability - Distributed tracing, custom Grafana dashboards, SLO monitoring
- Enterprise Features - SSO integration, multi-tenant isolation, approval workflows
- Edge Deployment - WASM execution in browser, offline-first capabilities
- Autonomous Intelligence - Self-organizing agent swarms, critic/reflection loops, group chat coordination
- Cross-Organization Federation - Secure agent communication across tenants, capability negotiation protocols
- Regulatory & Compliance - SOC 2, GDPR, HIPAA automation with audit trails
- AI Safety Frameworks - Constitutional AI, alignment mechanisms, adversarial testing
- Personalized Model Training - Learn from each user's successful task patterns, fine-tune models on user-specific interactions, apply trained models during agent inference
- Python Code Execution - Secure Python execution via WASI sandbox
- Multi-Agent Workflows - Orchestration patterns and best practices
- Pattern Usage Guide - ReAct, Tree-of-Thoughts, Debate patterns
- Streaming APIs - Real-time agent output streaming
- Authentication & Access Control - Multi-tenancy and OPA policies
- Memory System - Session + vector memory (Qdrant), MMR diversity, pattern learning
- System Prompts - Priority, role presets, and template variables
- Extending Shannon - Ways to extend templates, decomposition, and tools
- Adding Custom Tools - Complete guide for MCP, OpenAPI, and built-in tools
- Agent Core API - Rust service endpoints
- Orchestrator Service - Workflow management and patterns
- LLM Service API - Provider abstraction
- 🐛 Found a bug? Open an issue
- 💡 Have an idea? Start a discussion
- 💬 Need help? Join our Discord
- ⭐ Like the project? Give us a star!
We're building decentralized trust infrastructure with Solana blockchain:
- Cryptographic Verification: On-chain attestation of AI agent actions and results
- Immutable Audit Trail: Blockchain-based proof of task execution
- Smart Contract Interoperability: Enable AI agents to interact with DeFi and Web3 protocols
- Token-Gated Capabilities: Control agent permissions through blockchain tokens
- Decentralized Reputation: Build trust through verifiable on-chain agent performance
Stay tuned for our Web3 trust layer - bringing transparency and verifiability to AI systems!
Shannon builds upon and integrates amazing work from the open-source community:
- Agent Traffic Control - The original inspiration for our retro terminal UI design and agent visualization concept
- Model Context Protocol (MCP) - Anthropic's protocol for standardized LLM-tool interactions
- Claude Code - Used extensively in developing Shannon's codebase
- Temporal - The bulletproof workflow orchestration engine powering Shannon's reliability
- LangGraph - Inspiration for stateful agent architectures
- AutoGen - Microsoft's multi-agent conversation framework
- WASI - WebAssembly System Interface for secure code execution
- Open Policy Agent - Policy engine for fine-grained access control
Special thanks to all our contributors and the broader AI agent community for feedback, bug reports, and feature suggestions.
MIT License - Use it anywhere, modify anything, zero restrictions. See LICENSE.
Stop debugging AI failures. Start shipping reliable agents.
Discord •
GitHub
If Shannon saves you time or money, let us know! We love success stories.
Twitter/X: @shannon_agents