Architecture Overview
AxonFlow is an enterprise AI control plane that sits between your applications and AI infrastructure, providing low-latency policy enforcement, multi-agent orchestration, and permission-aware data access.
High-Level Architecture
Request Flow
When your application sends a request to AxonFlow, it first reaches the Agent service for static policy checks, is routed to the Orchestrator when LLM calls or dynamic policies are needed, and the audited response is returned to your application.
Core Components
Agent Service
The Agent service is the primary entry point for policy enforcement and AI agent execution.
Key Features:
- Single-digit ms P95 policy evaluation latency
- TLS 1.3 encryption for all connections
- Stateless architecture for horizontal scaling
- Built-in health checks and auto-recovery
Specifications:
- Default Count: 5 tasks (configurable 1-50)
- CPU: 1 vCPU per task
- Memory: 2 GB per task
- Port: 8080
- Protocol: HTTPS
Responsibilities:
- Static policy evaluation (PII, SQLi, rate limits)
- Request validation and authentication
- MCP connector orchestration with SQLi response scanning
- Code governance detection
- Audit logging with decision chain tracking
Orchestrator Service
The Orchestrator service handles LLM routing, dynamic policies, and multi-agent coordination.
Key Features:
- Multi-Agent Planning (MAP) for parallel execution
- Pluggable LLM provider system (OpenAI, Anthropic, Azure, Gemini, Ollama, Bedrock)
- Dynamic policy evaluation
- Cost controls and budget management
Specifications:
- Default Count: 10 tasks (configurable 1-50)
- CPU: 1 vCPU per task
- Memory: 2 GB per task
- Port: 8081
- Protocol: HTTP
Responsibilities:
- LLM routing and failover
- Dynamic policy enforcement
- Cost tracking and budget enforcement
- Execution replay and debugging
- Workflow orchestration
Database Layer
PostgreSQL database for persistent storage and audit trails.
Specifications:
- Engine: PostgreSQL 15.4
- Deployment: Multi-AZ for high availability
- Storage: 100 GB GP3 (expandable)
- Encryption: At-rest encryption enabled
- Backups: Automated daily backups (7-day retention)
Data Stored:
- Policy definitions and versions
- Audit logs and compliance trails
- MCP connector configurations
- Agent execution history
- Performance metrics
Application Load Balancer
Internal ALB for routing and TLS termination.
Features:
- TLS 1.3 encryption
- Health checks with automatic failover
- Connection draining for zero-downtime updates
Configuration:
- Type: Application Load Balancer
- Scheme: Internal (in-VPC only)
- Protocol: HTTPS (443)
- Target: Agent service (port 8080)
Feature Architecture
Policy Engine
AxonFlow uses a two-phase policy evaluation model for optimal performance and flexibility:
Phase 1 (Agent - Static Policies):
- Evaluated synchronously in single-digit milliseconds
- System-level rules: PII detection, SQL injection, rate limiting
- No external context required — cached and compiled for speed
Phase 2 (Orchestrator - Dynamic Policies):
- Evaluated after static policies pass
- Tenant-specific and tier-aware rules
- Can query external context (user roles, org settings)
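The two phases above can be sketched as a short pipeline: cheap, precompiled checks run first, and dynamic tenant-aware rules run only if they pass. The rule set, type names, and tier logic below are illustrative assumptions, not the AxonFlow API.

```typescript
type Decision = { allowed: boolean; phase: "static" | "dynamic"; reason?: string };

// Phase 1: context-free rules, precompiled for speed (here, a regex set).
const staticRules: { name: string; pattern: RegExp }[] = [
  { name: "sql_injection", pattern: /\b(union\s+select|or\s+1=1)\b/i },
  { name: "ssn", pattern: /\b\d{3}-\d{2}-\d{4}\b/ },
];

function evaluateStatic(input: string): Decision {
  for (const rule of staticRules) {
    if (rule.pattern.test(input)) {
      return { allowed: false, phase: "static", reason: rule.name };
    }
  }
  return { allowed: true, phase: "static" };
}

// Phase 2: runs only after phase 1 passes; may consult external context
// such as the caller's tier or org settings.
function evaluateDynamic(input: string, ctx: { tier: string }): Decision {
  if (ctx.tier === "free" && input.length > 1000) {
    return { allowed: false, phase: "dynamic", reason: "prompt_too_long_for_tier" };
  }
  return { allowed: true, phase: "dynamic" };
}

function evaluate(input: string, ctx: { tier: string }): Decision {
  const staticResult = evaluateStatic(input);
  return staticResult.allowed ? evaluateDynamic(input, ctx) : staticResult;
}
```

Because phase 1 needs no external lookups, it can stay in the single-digit-millisecond budget while phase 2 absorbs the slower, context-dependent work.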
Security Detection
PII Detection:
- Critical PII (redacts by default): SSN, credit cards, Aadhaar, PAN, passport numbers
- Non-critical PII (warns): Email, phone, names
- Configurable actions: block, redact, warn, log via the PII_ACTION env var
- Regional patterns: US (SSN), India (Aadhaar, PAN), EU (national IDs)
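A minimal sketch of configurable PII handling along the lines described above: critical patterns are matched, and the PII_ACTION environment variable selects the response. The pattern set and function names are assumptions for illustration, not the shipped detector.

```typescript
import { env } from "node:process";

const criticalPII: Record<string, RegExp> = {
  ssn: /\b\d{3}-\d{2}-\d{4}\b/g,            // US Social Security number
  creditCard: /\b(?:\d[ -]?){13,16}\b/g,    // loose card-number match
};

type PiiAction = "block" | "redact" | "warn" | "log";

function handlePII(
  text: string,
  action: PiiAction = (env.PII_ACTION as PiiAction) ?? "redact",
): { text: string; blocked: boolean } {
  let found = false;
  let out = text;
  for (const [name, pattern] of Object.entries(criticalPII)) {
    if (pattern.test(out)) {
      found = true;
      if (action === "redact") out = out.replace(pattern, `[REDACTED:${name}]`);
    }
    pattern.lastIndex = 0; // reset global-regex state between calls
  }
  if (found && action === "block") return { text: "", blocked: true };
  return { text: out, blocked: false }; // warn/log would emit events here
}
```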
SQL Injection Scanning:
- Input scanning: Detects SQLi payloads in user queries before LLM processing
- Response scanning: Scans MCP connector responses for SQLi in returned data
- Modes: Basic (pattern-based), Advanced (enterprise, deeper analysis)
- Default action: Block (high-confidence attacks), configurable via the SQLI_ACTION env var
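The "Basic" pattern-based mode can be pictured as a small scanner applied to both user input and MCP connector responses. The patterns below are a hedged illustration; the real rule set and confidence scoring are not shown here.

```typescript
const sqliPatterns: RegExp[] = [
  /\bunion\s+(all\s+)?select\b/i,
  /\bor\s+['"]?\d+['"]?\s*=\s*['"]?\d+/i, // tautologies like OR 1=1
  /;\s*drop\s+table\b/i,
  /--\s*$/m,                              // trailing SQL comment
];

// Returns the first matching pattern so the decision can be logged
// in the audit trail; used for both inputs and connector responses.
function scanForSQLi(text: string): { match: boolean; pattern?: string } {
  for (const p of sqliPatterns) {
    if (p.test(text)) return { match: true, pattern: p.source };
  }
  return { match: false };
}
```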
Code Governance:
- Detects code artifacts in LLM responses (14+ languages)
- Identifies: Language, code type, potential secrets, unsafe patterns (eval, shell injection)
- Logs metadata for compliance audits
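Conceptually, code governance runs heuristics over the LLM response to classify the language and flag risky constructs. The heuristics below are simplified assumptions to show the shape of the output, not the production classifier.

```typescript
function inspectResponse(text: string) {
  // Very rough language fingerprinting (the real detector covers 14+ languages).
  const language =
    /\bdef\s+\w+\(/.test(text) ? "python" :
    /\bfunction\s+\w+\(|=>/.test(text) ? "javascript" :
    /#include\s*</.test(text) ? "c" : null;

  const findings: string[] = [];
  if (/\beval\s*\(/.test(text)) findings.push("eval");
  if (/\b(child_process|os\.system|subprocess)\b/.test(text)) findings.push("shell_execution");
  if (/(api[_-]?key|secret)["']?\s*[:=]\s*["'][A-Za-z0-9_\-]{16,}["']/i.test(text)) {
    findings.push("possible_secret");
  }
  // Metadata like this is what gets logged for compliance audits.
  return { isCode: language !== null, language, findings };
}
```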
Governance Features
Decision Chain Tracking:
- Records complete audit trail of AI decisions (EU AI Act Article 12 compliance)
- Captures: Policy evaluations, LLM generations, data retrievals, human approvals
- SHA-256 hash verification for tamper detection
- Async persistence with configurable worker pool
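The SHA-256 tamper detection above is essentially a hash chain: each record's hash covers its own payload plus the previous record's hash, so editing any entry invalidates everything after it. A minimal sketch, with assumed field names:

```typescript
import { createHash } from "node:crypto";

interface ChainEntry { type: string; payload: string; prevHash: string; hash: string }

function appendEntry(chain: ChainEntry[], type: string, payload: string): ChainEntry[] {
  const prevHash = chain.length ? chain[chain.length - 1].hash : "genesis";
  const hash = createHash("sha256")
    .update(`${type}|${payload}|${prevHash}`)
    .digest("hex");
  return [...chain, { type, payload, prevHash, hash }];
}

// Recompute every hash from scratch; any mutation breaks the chain.
function verifyChain(chain: ChainEntry[]): boolean {
  return chain.every((e, i) => {
    const prevHash = i === 0 ? "genesis" : chain[i - 1].hash;
    const expected = createHash("sha256")
      .update(`${e.type}|${e.payload}|${prevHash}`)
      .digest("hex");
    return e.prevHash === prevHash && e.hash === expected;
  });
}
```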
Cost Controls:
- Budget scopes: Organization, Team, Agent, Workflow, User
- Budget periods: Daily, Weekly, Monthly, Quarterly, Yearly
- Actions on exceed: Warn, Block, Downgrade (Enterprise)
- Real-time usage tracking across all LLM providers
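The scope/period/action model above can be sketched as a small budget check run before each LLM call. Types and names here are illustrative assumptions:

```typescript
type Scope = "organization" | "team" | "agent" | "workflow" | "user";
type ExceedAction = "warn" | "block" | "downgrade";

interface Budget { scope: Scope; limitUSD: number; spentUSD: number; onExceed: ExceedAction }

function recordSpend(b: Budget, costUSD: number): { allowed: boolean; action?: ExceedAction } {
  const projected = b.spentUSD + costUSD;
  if (projected <= b.limitUSD) {
    b.spentUSD = projected;
    return { allowed: true };
  }
  if (b.onExceed === "block") return { allowed: false, action: "block" };
  b.spentUSD = projected; // warn/downgrade still let the call proceed
  return { allowed: true, action: b.onExceed };
}
```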
Execution Replay:
- Replay specific requests with full decision chain visibility
- Deterministic timeline for debugging governed workflows
- Compliance-grade exports for audits
Integration Modes
AxonFlow supports two integration patterns:
| Mode | Use Case | Flow |
|---|---|---|
| Proxy Mode | New projects, full governance | App → AxonFlow → LLM → AxonFlow → App |
| Gateway Mode | Existing stacks (LangChain, CrewAI) | Pre-check → Your LLM call → Audit |
Proxy Mode routes all LLM traffic through AxonFlow for complete lifecycle control.
Gateway Mode integrates with existing frameworks — call getPolicyApprovedContext() before your LLM call, then auditLLMCall() after.
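The Gateway Mode sequence can be sketched as a wrapper around your existing LLM call. getPolicyApprovedContext and auditLLMCall are the calls named above, but their signatures here are assumptions, and the client is mocked rather than the real SDK:

```typescript
interface AxonFlowClient {
  getPolicyApprovedContext(prompt: string): Promise<{ approved: boolean; context: string }>;
  auditLLMCall(record: { prompt: string; response: string }): Promise<void>;
}

async function governedCall(
  client: AxonFlowClient,
  callLLM: (prompt: string) => Promise<string>, // your existing LangChain/CrewAI call
  prompt: string,
): Promise<string> {
  const pre = await client.getPolicyApprovedContext(prompt); // 1. pre-check
  if (!pre.approved) throw new Error("blocked by policy");
  const response = await callLLM(pre.context);               // 2. your LLM call
  await client.auditLLMCall({ prompt, response });           // 3. audit
  return response;
}
```

The LLM traffic itself never passes through AxonFlow in this mode; only the policy decision and the audit record do.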
Design Principles
1. Performance First
- Low Latency: Every architecture decision optimized for single-digit ms policy evaluation
- Connection Pooling: Persistent connections to database
- Stateless Services: No session affinity required
- In-Memory Caching: Policies compiled and cached for fast evaluation
2. Security by Default
- In-VPC Deployment: Data never leaves customer infrastructure
- Encryption Everywhere: TLS 1.3 in transit, encryption at rest
- Least Privilege: Minimal IAM permissions
- Audit Logging: Complete trail of all operations
3. High Availability
- Multi-AZ Deployment: Services and database across availability zones
- Auto Scaling: Automatic capacity adjustment
- Health Checks: Continuous monitoring with auto-recovery
- Zero-Downtime Updates: Rolling deployments
4. Observable by Design
- CloudWatch Integration: Native AWS monitoring
- Structured Logging: JSON logs for easy parsing
- Custom Metrics: Policy latency, throughput, error rates
- Distributed Tracing: Request flow across services
Scalability
Horizontal Scaling
Agent Service:
- Scales from 1 to 50 tasks
- Auto-scaling based on CPU utilization (target: 70%)
- Scale-out: 60 seconds
- Scale-in: 300 seconds (5 minutes)
Orchestrator Service:
- Scales from 1 to 50 tasks
- Auto-scaling based on CPU utilization (target: 70%)
- Independent scaling from Agent service
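The 70% CPU target above follows standard target-tracking math: the desired task count is the current count scaled by actual-over-target utilization, clamped to the configured range. A minimal sketch (cooldowns omitted):

```typescript
function desiredTaskCount(
  current: number,
  cpuPercent: number,
  target = 70, // target utilization from the scaling policy
  min = 1,
  max = 50,
): number {
  const desired = Math.ceil(current * (cpuPercent / target));
  return Math.min(max, Math.max(min, desired));
}
```

For example, 5 tasks at 140% aggregate CPU scale out to 10 tasks, while sustained low utilization scales back in after the 300-second cooldown.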
Vertical Scaling
ECS Tasks:
- Configurable CPU: 256-4096 (0.25-4 vCPU)
- Configurable Memory: 512-30720 MB (0.5-30 GB)
RDS Database:
- Instance types: db.t3.medium to db.r5.xlarge
- Storage: Auto-scaling enabled
- Read replicas: Available for read-heavy workloads
Performance Characteristics
Latency
- Policy Evaluation: Single-digit ms P95
- Agent Execution: <30ms P95 (without external calls)
- MAP Coordination: <50ms P95 for 10-agent tasks
- Database Queries: <5ms P95
Throughput
Pilot Tier:
- 50,000 requests/month
- ~1.2 requests/minute average
- Burst: 100 requests/second
Growth Tier:
- 500,000 requests/month
- ~12 requests/minute average
- Burst: 500 requests/second
Enterprise Tier:
- Unlimited requests
- Auto-scaling to handle demand
- Burst: >1,000 requests/second
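One common way average and burst limits like these coexist is a token bucket: capacity bounds the burst, refill rate bounds the sustained average. This is an illustrative sketch of the concept, not AxonFlow's rate limiter:

```typescript
class TokenBucket {
  private tokens: number;

  constructor(
    private capacity: number,      // burst size (requests)
    private refillPerSec: number,  // sustained rate (requests/second)
    private lastRefill = Date.now(),
  ) {
    this.tokens = capacity;
  }

  // Returns true if the request is admitted, false if rate-limited.
  allow(now = Date.now()): boolean {
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```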
Resource Utilization
Default Configuration (5 agents + 10 orchestrators):
- Total vCPU: 15 (5 × 1 vCPU + 10 × 1 vCPU)
- Total Memory: 30 GB (5 × 2 GB + 10 × 2 GB)
- Database: db.t3.medium (2 vCPU, 4 GB RAM)
- Total Monthly Cost: ~$400-500 in AWS charges (excluding AxonFlow licensing)
Integration Points
LLM Providers
AxonFlow is model-agnostic — any model available through a supported provider works automatically.
| Provider | Community | Enterprise |
|---|---|---|
| OpenAI | ✅ | ✅ |
| Anthropic | ✅ | ✅ |
| Azure OpenAI | ✅ | ✅ |
| Google Gemini | ✅ | ✅ |
| Ollama (self-hosted) | ✅ | ✅ |
| AWS Bedrock | ❌ | ✅ |
AI Frameworks
- LangChain, LangGraph, LlamaIndex
- CrewAI, AutoGen, DSPy
- Semantic Kernel, Copilot Studio
- Custom integrations via SDK
MCP Connectors
- Databases: PostgreSQL, MySQL, MongoDB, Redis, Cassandra
- Storage: S3, GCS, Azure Blob
- Enterprise: Salesforce, Slack, Amadeus, Jira, ServiceNow, Snowflake
AWS Services
- Secrets Manager for credential storage
- CloudWatch for monitoring and logging
- Systems Manager for configuration
Security Architecture
For detailed security information, see Security Best Practices.
Key Security Features:
- In-VPC deployment (no internet exposure)
- Security groups with least-privilege rules
- IAM roles with minimal permissions
- Secrets Manager for credential storage
- Encryption at rest and in transit
- Complete audit logging
AWS Infrastructure
For enterprise deployments, AxonFlow runs entirely within your AWS VPC:
Key Points:
- All data stays within your VPC
- Multi-AZ deployment for high availability
- Private subnets for compute and database
- Secrets Manager for credential storage
Next Steps
- Infrastructure Details - CloudFormation resources
- Security Best Practices - Security configuration
- Features Overview - Platform capabilities