Architecture Overview
AxonFlow is an enterprise AI control plane that sits between your applications and AI infrastructure, providing low-latency policy enforcement, multi-agent orchestration, and permission-aware data access.
High-Level Architecture
Request Flow
When your application sends a request to AxonFlow, it first reaches the Agent service for static policy checks, is routed to the Orchestrator when LLM calls or dynamic policies are needed, and the audited response is returned to your application.
Core Components
Agent Service
The Agent service is the primary entry point for policy enforcement and AI agent execution.
Key Features:
- Single-digit ms P95 policy evaluation latency
- TLS 1.3 encryption for all connections
- Stateless architecture for horizontal scaling
- Built-in health checks and auto-recovery
Specifications:
- Default Count: 5 tasks (configurable 1-50)
- CPU: 1 vCPU per task
- Memory: 2 GB per task
- Port: 8080
- Protocol: HTTPS
Responsibilities:
- Static policy evaluation (PII, SQLi, rate limits)
- Request validation and authentication
- MCP connector orchestration with SQLi response scanning
- Code governance detection
- Audit logging with decision chain tracking
Orchestrator Service
The Orchestrator service handles LLM routing, dynamic policies, and multi-agent coordination.
Key Features:
- Multi-Agent Planning (MAP) for parallel execution
- Pluggable LLM provider system (OpenAI, Anthropic, Azure, Gemini, Ollama, Bedrock)
- Dynamic policy evaluation
- Cost controls and budget management
Specifications:
- Default Count: 10 tasks (configurable 1-50)
- CPU: 1 vCPU per task
- Memory: 2 GB per task
- Port: 8081
- Protocol: HTTP
Responsibilities:
- LLM routing and failover
- Dynamic policy enforcement
- Cost tracking and budget enforcement
- Execution replay and debugging
- Workflow orchestration
Database Layer
PostgreSQL database for persistent storage and audit trails.
Specifications:
- Engine: PostgreSQL 15.4
- Deployment: Multi-AZ for high availability
- Storage: 100 GB GP3 (expandable)
- Encryption: At-rest encryption enabled
- Backups: Automated daily backups (7-day retention)
Data Stored:
- Policy definitions and versions
- Audit logs and compliance trails
- MCP connector configurations
- Agent execution history
- Performance metrics
Application Load Balancer
Internal ALB for routing and TLS termination.
Features:
- TLS 1.3 encryption
- Health checks with automatic failover
- Connection draining for zero-downtime updates
Configuration:
- Type: Application Load Balancer
- Scheme: Internal (in-VPC only)
- Protocol: HTTPS (443)
- Target: Agent service (port 8080)
Feature Architecture
Policy Engine
AxonFlow uses a two-phase policy evaluation model for optimal performance and flexibility:
Phase 1 (Agent - Static Policies):
- Evaluated synchronously in single-digit milliseconds
- System-level rules: PII detection, SQL injection, rate limiting
- No external context required — cached and compiled for speed
Phase 2 (Orchestrator - Dynamic Policies):
- Evaluated after static policies pass
- Tenant-specific and tier-aware rules
- Can query external context (user roles, org settings)
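The two phases above can be sketched as a short pipeline: cheap, precompiled checks run first, and dynamic tenant-aware rules run only if they pass. The rule set, type names, and tier logic below are illustrative assumptions, not the AxonFlow API.

```typescript
type Decision = { allowed: boolean; phase: "static" | "dynamic"; reason?: string };

// Phase 1: context-free rules, precompiled for speed (here, a regex set).
const staticRules: { name: string; pattern: RegExp }[] = [
  { name: "sql_injection", pattern: /\b(union\s+select|or\s+1=1)\b/i },
  { name: "ssn", pattern: /\b\d{3}-\d{2}-\d{4}\b/ },
];

function evaluateStatic(input: string): Decision {
  for (const rule of staticRules) {
    if (rule.pattern.test(input)) {
      return { allowed: false, phase: "static", reason: rule.name };
    }
  }
  return { allowed: true, phase: "static" };
}

// Phase 2: runs only after phase 1 passes; may consult external context
// such as the caller's tier or org settings.
function evaluateDynamic(input: string, ctx: { tier: string }): Decision {
  if (ctx.tier === "free" && input.length > 1000) {
    return { allowed: false, phase: "dynamic", reason: "prompt_too_long_for_tier" };
  }
  return { allowed: true, phase: "dynamic" };
}

function evaluate(input: string, ctx: { tier: string }): Decision {
  const staticResult = evaluateStatic(input);
  return staticResult.allowed ? evaluateDynamic(input, ctx) : staticResult;
}
```

Because phase 1 needs no external lookups, it can stay in the single-digit-millisecond budget while phase 2 absorbs the slower, context-dependent work.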
Security Detection
PII Detection:
- Critical PII (redacts by default): SSN, credit cards, Aadhaar, PAN, passport numbers
- Non-critical PII (warns): Email, phone, names
- Configurable actions: block, redact, warn, log via the PII_ACTION env var
- Regional patterns: US (SSN), India (Aadhaar, PAN), EU (national IDs)
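A minimal sketch of configurable PII handling along the lines described above: critical patterns are matched, and the PII_ACTION environment variable selects the response. The pattern set and function names are assumptions for illustration, not the shipped detector.

```typescript
import { env } from "node:process";

const criticalPII: Record<string, RegExp> = {
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,            // US Social Security number
  creditCard: /\b(?:\d[ -]?){13,16}\b/g,    // loose card-number match
};

type PiiAction = "block" | "redact" | "warn" | "log";

function handlePII(
  text: string,
  action: PiiAction = (env.PII_ACTION as PiiAction) ?? "redact",
): { text: string; blocked: boolean } {
  let found = false;
  let out = text;
  for (const [name, pattern] of Object.entries(criticalPII)) {
    if (pattern.test(out)) {
      found = true;
      if (action === "redact") out = out.replace(pattern, `[REDACTED:${name}]`);
    }
    pattern.lastIndex = 0; // reset global-regex state between calls
  }
  if (found && action === "block") return { text: "", blocked: true };
  return { text: out, blocked: false }; // warn/log would emit events here
}
```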
SQL Injection Scanning:
- Input scanning: Detects SQLi payloads in user queries before LLM processing
- Response scanning: Scans MCP connector responses for SQLi in returned data
- Modes: Basic (pattern-based), Advanced (enterprise, deeper analysis)
- Default action: Block (high-confidence attacks), configurable via the SQLI_ACTION env var
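The "Basic" pattern-based mode can be pictured as a small scanner applied to both user input and MCP connector responses. The patterns below are a hedged illustration; the real rule set and confidence scoring are not shown here.

```typescript
const sqliPatterns: RegExp[] = [
  /\bunion\s+(all\s+)?select\b/i,
  /\bor\s+['"]?\d+['"]?\s*=\s*['"]?\d+/i, // tautologies like OR 1=1
  /;\s*drop\s+table\b/i,
  /--\s*$/m,                              // trailing SQL comment
];

// Returns the first matching pattern so the decision can be logged
// in the audit trail; used for both inputs and connector responses.
function scanForSQLi(text: string): { match: boolean; pattern?: string } {
  for (const p of sqliPatterns) {
    if (p.test(text)) return { match: true, pattern: p.source };
  }
  return { match: false };
}
```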
Code Governance:
- Detects code artifacts in LLM responses (14+ languages)
- Identifies: Language, code type, potential secrets, unsafe patterns (eval, shell injection)
- Logs metadata for compliance audits
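Conceptually, code governance runs heuristics over the LLM response to classify the language and flag risky constructs. The heuristics below are simplified assumptions to show the shape of the output, not the production classifier.

```typescript
function inspectResponse(text: string) {
  // Very rough language fingerprinting (the real detector covers 14+ languages).
  const language =
    /\bdef\s+\w+\(/.test(text) ? "python" :
    /\bfunction\s+\w+\(|=>/.test(text) ? "javascript" :
    /#include\s*</.test(text) ? "c" : null;

  const findings: string[] = [];
  if (/\beval\s*\(/.test(text)) findings.push("eval");
  if (/\b(child_process|os\.system|subprocess)\b/.test(text)) findings.push("shell_execution");
  if (/(api[_-]?key|secret)["']?\s*[:=]\s*["'][A-Za-z0-9_\-]{16,}["']/i.test(text)) {
    findings.push("possible_secret");
  }
  // Metadata like this is what gets logged for compliance audits.
  return { isCode: language !== null, language, findings };
}
```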
Governance Features
Decision Chain Tracking:
- Records complete audit trail of AI decisions (EU AI Act Article 12 compliance)
- Captures: Policy evaluations, LLM generations, data retrievals, human approvals
- SHA-256 hash verification for tamper detection
- Async persistence with configurable worker pool
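The SHA-256 tamper detection above is essentially a hash chain: each record's hash covers its own payload plus the previous record's hash, so editing any entry invalidates everything after it. A minimal sketch, with assumed field names:

```typescript
import { createHash } from "node:crypto";

interface ChainEntry { type: string; payload: string; prevHash: string; hash: string }

function appendEntry(chain: ChainEntry[], type: string, payload: string): ChainEntry[] {
  const prevHash = chain.length ? chain[chain.length - 1].hash : "genesis";
  const hash = createHash("sha256")
    .update(`${type}|${payload}|${prevHash}`)
    .digest("hex");
  return [...chain, { type, payload, prevHash, hash }];
}

// Recompute every hash from scratch; any mutation breaks the chain.
function verifyChain(chain: ChainEntry[]): boolean {
  return chain.every((e, i) => {
    const prevHash = i === 0 ? "genesis" : chain[i - 1].hash;
    const expected = createHash("sha256")
      .update(`${e.type}|${e.payload}|${prevHash}`)
      .digest("hex");
    return e.prevHash === prevHash && e.hash === expected;
  });
}
```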
Cost Controls:
- Budget scopes: Organization, Team, Agent, Workflow, User
- Budget periods: Daily, Weekly, Monthly, Quarterly, Yearly
- Actions on exceed: Warn, Block, Downgrade (Enterprise)
- Real-time usage tracking across all LLM providers
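The scope/period/action model above can be sketched as a small budget check run before each LLM call. Types and names here are illustrative assumptions:

```typescript
type Scope = "organization" | "team" | "agent" | "workflow" | "user";
type ExceedAction = "warn" | "block" | "downgrade";

interface Budget { scope: Scope; limitUSD: number; spentUSD: number; onExceed: ExceedAction }

function recordSpend(b: Budget, costUSD: number): { allowed: boolean; action?: ExceedAction } {
  const projected = b.spentUSD + costUSD;
  if (projected <= b.limitUSD) {
    b.spentUSD = projected;
    return { allowed: true };
  }
  if (b.onExceed === "block") return { allowed: false, action: "block" };
  b.spentUSD = projected; // warn/downgrade still let the call proceed
  return { allowed: true, action: b.onExceed };
}
```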
Execution Replay:
- Replay specific requests with full decision chain visibility
- Deterministic timeline for debugging governed workflows
- Compliance-grade exports for audits
Integration Modes
AxonFlow supports two integration patterns:
| Mode | Use Case | Flow |
|---|---|---|
| Proxy Mode | New projects, full governance | App → AxonFlow → LLM → AxonFlow → App |
| Gateway Mode | Existing stacks (LangChain, CrewAI) | Pre-check → Your LLM call → Audit |
Proxy Mode routes all LLM traffic through AxonFlow for complete lifecycle control.
Gateway Mode integrates with existing frameworks — call getPolicyApprovedContext() before your LLM call, then auditLLMCall() after.
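The Gateway Mode sequence can be sketched as a wrapper around your existing LLM call. getPolicyApprovedContext and auditLLMCall are the calls named above, but their signatures here are assumptions, and the client is mocked rather than the real SDK:

```typescript
interface AxonFlowClient {
  getPolicyApprovedContext(prompt: string): Promise<{ approved: boolean; context: string }>;
  auditLLMCall(record: { prompt: string; response: string }): Promise<void>;
}

async function governedCall(
  client: AxonFlowClient,
  callLLM: (prompt: string) => Promise<string>, // your existing LangChain/CrewAI call
  prompt: string,
): Promise<string> {
  const pre = await client.getPolicyApprovedContext(prompt); // 1. pre-check
  if (!pre.approved) throw new Error("blocked by policy");
  const response = await callLLM(pre.context);               // 2. your LLM call
  await client.auditLLMCall({ prompt, response });           // 3. audit
  return response;
}
```

The LLM traffic itself never passes through AxonFlow in this mode; only the policy decision and the audit record do.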
Design Principles
1. Performance First
- Low Latency: Every architecture decision optimized for single-digit ms policy evaluation
- Connection Pooling: Persistent connections to database
- Stateless Services: No session affinity required
- In-Memory Caching: Policies compiled and cached for fast evaluation
2. Security by Default
- In-VPC Deployment: Data never leaves customer infrastructure
- Encryption Everywhere: TLS 1.3 in transit, encryption at rest
- Least Privilege: Minimal IAM permissions
- Audit Logging: Complete trail of all operations
3. High Availability
- Multi-AZ Deployment: Services and database across availability zones
- Auto Scaling: Automatic capacity adjustment
- Health Checks: Continuous monitoring with auto-recovery
- Zero-Downtime Updates: Rolling deployments
4. Observable by Design
- CloudWatch Integration: Native AWS monitoring
- Structured Logging: JSON logs for easy parsing
- Custom Metrics: Policy latency, throughput, error rates
- Distributed Tracing: Request flow across services
Scalability
Horizontal Scaling
Agent Service:
- Scales from 1 to 50 tasks
- Auto-scaling based on CPU utilization (target: 70%)
- Scale-out: 60 seconds
- Scale-in: 300 seconds (5 minutes)
Orchestrator Service:
- Scales from 1 to 50 tasks
- Auto-scaling based on CPU utilization (target: 70%)
- Independent scaling from Agent service
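The 70% CPU target above follows standard target-tracking math: the desired task count is the current count scaled by actual-over-target utilization, clamped to the configured range. A minimal sketch (cooldowns omitted):

```typescript
function desiredTaskCount(
  current: number,
  cpuPercent: number,
  target = 70, // target utilization from the scaling policy
  min = 1,
  max = 50,
): number {
  const desired = Math.ceil(current * (cpuPercent / target));
  return Math.min(max, Math.max(min, desired));
}
```

For example, 5 tasks at 140% aggregate CPU scale out to 10 tasks, while sustained low utilization scales back in after the 300-second cooldown.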
Vertical Scaling
ECS Tasks:
- Configurable CPU: 256-4096 (0.25-4 vCPU)
- Configurable Memory: 512-30720 MB (0.5-30 GB)
RDS Database:
- Instance types: db.t3.medium to db.r5.xlarge
- Storage: Auto-scaling enabled
- Read replicas: Available for read-heavy workloads
Performance Characteristics
Latency
- Policy Evaluation: Single-digit ms P95
- Agent Execution: <30ms P95 (without external calls)
- MAP Coordination: <50ms P95 for 10-agent tasks
- Database Queries: <5ms P95
Throughput
Pilot Tier:
- 50,000 requests/month
- ~1.2 requests/minute average
- Burst: 100 requests/second
Growth Tier:
- 500,000 requests/month
- ~12 requests/minute average
- Burst: 500 requests/second
Enterprise Tier:
- Unlimited requests
- Auto-scaling to handle demand
- Burst: >1,000 requests/second
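One common way average and burst limits like these coexist is a token bucket: capacity bounds the burst, refill rate bounds the sustained average. This is an illustrative sketch of the concept, not AxonFlow's rate limiter:

```typescript
class TokenBucket {
  private tokens: number;

  constructor(
    private capacity: number,      // burst size (requests)
    private refillPerSec: number,  // sustained rate (requests/second)
    private lastRefill = Date.now(),
  ) {
    this.tokens = capacity;
  }

  // Returns true if the request is admitted, false if rate-limited.
  allow(now = Date.now()): boolean {
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```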
Resource Utilization
Default Configuration (5 agents + 10 orchestrators):
- Total vCPU: 15 (5 × 1 vCPU + 10 × 1 vCPU)
- Total Memory: 30 GB (5 × 2 GB + 10 × 2 GB)
- Database: db.t3.medium (2 vCPU, 4 GB RAM)
- Total Monthly Cost: ~$400-500 in AWS charges (excluding AxonFlow licensing)
Integration Points
LLM Providers
AxonFlow is model-agnostic — any model available through a supported provider works automatically.
| Provider | Community | Enterprise |
|---|---|---|
| OpenAI | ✅ | ✅ |
| Anthropic | ✅ | ✅ |
| Azure OpenAI | ✅ | ✅ |
| Google Gemini | ✅ | ✅ |
| Ollama (self-hosted) | ✅ | ✅ |
| AWS Bedrock | ❌ | ✅ |
AI Frameworks
- LangChain, LangGraph, LlamaIndex
- CrewAI, AutoGen, DSPy
- Semantic Kernel, Copilot Studio
- Custom integrations via SDK
MCP Connectors
- Databases: PostgreSQL, MySQL, MongoDB, Redis, Cassandra
- Storage: S3, GCS, Azure Blob
- Enterprise: Salesforce, Slack, Amadeus, Jira, ServiceNow, Snowflake
AWS Services
- Secrets Manager for credential storage
- CloudWatch for monitoring and logging
- Systems Manager for configuration
Security Architecture
For detailed security information, see Security Best Practices.
Key Security Features:
- In-VPC deployment (no internet exposure)
- Security groups with least-privilege rules
- IAM roles with minimal permissions
- Secrets Manager for credential storage
- Encryption at rest and in transit
- Complete audit logging
AWS Infrastructure
For enterprise deployments, AxonFlow runs entirely within your AWS VPC:
Key Points:
- All data stays within your VPC
- Multi-AZ deployment for high availability
- Private subnets for compute and database
- Secrets Manager for credential storage
Next Steps
- Infrastructure Details - CloudFormation resources
- Security Best Practices - Security configuration
- Features Overview - Platform capabilities