An open-source AI agent for enterprise compliance monitoring, regulatory intelligence, and GRC workflow automation across the GCC and MENA region.
- What is CIRA?
- Key Capabilities
- Architecture Overview
- Tech Stack
- Getting Started
- Usage Guide
- Project Structure
- Configuration Deep Dive
- Tools Reference
- Audit Trail System
- Compliance Frameworks
- Deployment
- Testing
- Development
- Roadmap
- Contributing
- License
- Disclaimer
CIRA (Compliance Intelligence & Risk Agent) is a production-grade, domain-specialized AI agent built for enterprise Governance, Risk & Compliance (GRC) functions. It combines a large language model reasoning core with a retrieval-augmented regulatory knowledge base to deliver continuous compliance monitoring, automated gap analysis, and audit-ready reporting — at a speed and scale that traditional manual processes cannot match.
CIRA is designed for the GCC and MENA regulatory environment but is architecturally generic enough to be adapted to any jurisdiction. It runs locally via Ollama or connects to cloud LLM providers (Azure OpenAI, OpenAI, Anthropic) with a single environment variable change.
- Domain-specialized — purpose-built prompts and tools for GRC workflows, not a generic chatbot wrapper.
- Source-cited — every finding is backed by a specific regulatory clause or document reference.
- Audit-ready — immutable, append-only audit logs for every agent decision with full provenance.
- Provider-agnostic — swap between Ollama (local/private) and cloud LLMs (Azure, OpenAI, Anthropic) without code changes.
- Extensible — add new compliance frameworks by dropping documents into a folder and running the ingestion script.
| Domain | What CIRA Does |
|---|---|
| Regulatory Compliance | Monitors regulatory obligations, detects changes, flags gaps, generates compliance briefs |
| Third-Party Risk | Scores vendor risk across configurable domains, tracks contract compliance, alerts on changes |
| ESG Reporting | Maps disclosures to GRI, TCFD, and SASB frameworks; identifies gaps pre-publication |
| HSE & Business Continuity | Monitors HSE policy adherence, tracks BCP test records, flags overdue reviews |
| Technical & Cyber Risk | IT risk register management, control gap analysis, security policy compliance |
| Finance & Project Compliance | Financial control monitoring, project budget compliance, procurement governance |
| Licensing & Software Governance | Software asset tracking, license compliance, renewal risk flagging |
| Audit Trail Generation | Immutable, source-cited audit logs for every agent decision and recommendation |
┌─────────────────────────────────────────────────────────────────┐
│ CIRA Agent Loop │
│ (LangGraph ReAct) │
│ │
│ User / System Input │
│ │ │
│ ▼ │
│ ┌─────────────────┐ ┌──────────────────────────────────┐ │
│ │ Ingestion & │───▶│ Regulatory Knowledge Base │ │
│ │ Context Engine │ │ (RAG · ChromaDB / pgvector) │ │
│ │ (PDF/DOCX/TXT/ │ │ Chunked embeddings with source │ │
│ │ CSV/JSON/MD) │ │ metadata and page references │ │
│ └─────────────────┘ └──────────────────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Compliance Reasoning Engine │ │
│ │ LLM Core (Ollama / Azure OpenAI / OpenAI / Anthropic) │ │
│ │ Chain-of-thought · Source citation · Confidence │ │
│ │ Tool-calling · Structured output │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────┐ ┌─────────────────────────────────┐ │
│ │ Validation & │ │ Audit Trail Module │ │
│ │ Fact-Check │ │ (JSONL · Immutable · Per-day) │ │
│ └─────────────────┘ └─────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Delivery & Integration Layer │ │
│ │ REST API · Reports (JSON/TXT/DOCX) · Webhooks │ │
│ └──────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
| Component | Module | Description |
|---|---|---|
| Ingestion Engine | cira/knowledge/ingestor.py | Parses policy documents, regulatory circulars, audit reports, contracts, and structured data (PDF, DOCX, TXT, CSV, JSON, Markdown). Uses RecursiveCharacterTextSplitter with configurable chunk size (default 1000) and overlap (default 200). |
| Regulatory Knowledge Base | cira/knowledge/vector_store.py | A vector store of compliance frameworks and regulatory documents. Supports ChromaDB (local development) and pgvector (production). Documents are embedded with source metadata for citation. |
| Reasoning Engine | cira/graph.py | A LangGraph-powered agentic loop using the ReAct pattern. The agent decomposes compliance obligations, invokes tools to retrieve evidence, runs gap analysis, assigns risk ratings, and generates source-cited findings. |
| Validation Module | Embedded in prompts | Cross-references all factual and regulatory claims against the knowledge base before output. Unverifiable claims are flagged with LOW confidence and are never stated as fact. |
| Audit Trail Module | cira/tools/audit_logger.py | Every agent action is logged with full provenance: timestamp (UTC), action type, detail, confidence score, input summary, regulatory reference, and metadata. Stored as append-only JSONL files (one per day). |
| Delivery Layer | cira/api/routes.py, cira/tools/report_gen.py | Outputs findings as structured JSON (API), narrative text reports, DOCX documents, or real-time webhook alerts. REST API with OpenAPI/Swagger documentation. |
CIRA uses a ReAct (Reasoning + Acting) loop implemented with LangGraph:
┌──────────┐
│ │
│ START │
│ │
└────┬─────┘
│
▼
┌──────────┐ Has tool calls? ┌──────────┐
│ │─────── YES ──────────────▶│ │
│ Agent │ │ Tools │
│ (LLM) │◀──────────────────────────│ Execute │
│ │ │ │
└────┬─────┘ └──────────┘
│
│ No tool calls (final answer)
▼
┌──────────┐
│ │
│ END │
│ │
└──────────┘
- The Agent node receives the user query + system prompt and decides whether to invoke a tool or produce a final answer.
- If tool calls are present, the Tools node executes them (gap analysis search, document parsing, risk scoring, etc.) and returns results.
- The loop continues — the agent reasons over tool outputs, possibly calling additional tools — until it produces a final response with no tool calls.
- Every tool invocation is logged to the audit trail with provenance metadata.
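The loop described above can be sketched in plain Python without LangGraph. This is an illustrative stand-in, not the code in cira/graph.py: the `Message` class, `run_react_loop`, the stub `fake_llm`, and the `gap_search` tool are all hypothetical names introduced here, and the `max_steps` guard is an assumption rather than a documented CIRA setting.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    role: str  # "user", "assistant", or "tool"
    content: str
    tool_calls: list[dict] = field(default_factory=list)


def run_react_loop(
    llm: Callable[[list[Message]], Message],
    tools: dict[str, Callable[..., str]],
    query: str,
    max_steps: int = 5,  # assumed safety limit, not a documented CIRA option
) -> tuple[str, list[dict]]:
    """Alternate between the agent and tool nodes until no tool calls remain."""
    messages = [Message("user", query)]
    audit_log: list[dict] = []
    for _ in range(max_steps):
        reply = llm(messages)  # Agent node: reason over the history so far
        messages.append(reply)
        if not reply.tool_calls:  # no tool calls means a final answer
            return reply.content, audit_log
        for call in reply.tool_calls:  # Tools node: execute each requested tool
            result = tools[call["name"]](**call["args"])
            audit_log.append({"action": call["name"], "args": call["args"]})
            messages.append(Message("tool", result))
    raise RuntimeError("max_steps reached without a final answer")


def fake_llm(messages: list[Message]) -> Message:
    """Stub model: request one tool call, then answer from the tool output."""
    if messages[-1].role == "user":
        call = {"name": "gap_search", "args": {"query": "access control"}}
        return Message("assistant", "", [call])
    return Message("assistant", f"Final answer based on: {messages[-1].content}")


def gap_search(query: str) -> str:
    """Hypothetical tool that would normally query the knowledge base."""
    return f"excerpt about {query}"


answer, audit = run_react_loop(fake_llm, {"gap_search": gap_search}, "Check our access control policy")
print(answer)
print(audit)
```

In a real graph the routing decision (tool calls present or not) would be a conditional edge rather than an `if` statement, but the control flow is the same.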
| Layer | Technology | Purpose |
|---|---|---|
| Agent Framework | LangGraph 0.2+ | Stateful agent graph with ReAct loop |
| LLM Orchestration | LangChain 0.3+ | Tool binding, message handling, provider abstractions |
| LLM (Local) | Ollama | llama3.1, qwen2.5, mistral — runs fully offline |
| LLM (Cloud) | Azure OpenAI · OpenAI · Anthropic Claude | Configurable via single env var |
| Embeddings | Ollama (nomic-embed-text) · Azure OpenAI · OpenAI | Document embedding for RAG |
| Vector Store | ChromaDB 0.5+ | Local development vector store |
| Vector Store (Prod) | pgvector | PostgreSQL-based production vector store |
| Document Parsing | PyMuPDF · python-docx · LangChain loaders | PDF, DOCX, TXT, CSV, JSON, Markdown |
| API Layer | FastAPI 0.115+ | REST API with OpenAPI docs |
| Validation | Pydantic 2.0+ | Request/response schemas and config |
| Configuration | Pydantic Settings | Type-safe env var management |
| Logging | structlog | JSON-structured logging |
| HTTP | httpx | Webhook delivery |
| Auth | API Key header (X-API-Key) | Development authentication |
| Containerization | Docker + Docker Compose | Production deployment |
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.11+ | Required |
| Ollama | Latest | Required for local LLM mode |
| Git | Any | Required |
| Docker | Latest | Optional — for containerized deployment |
git clone https://github.com/mabualzait/cira-agent.git
cd cira-agent
python -m venv .venv
source .venv/bin/activate # macOS/Linux
# .venv\Scripts\activate # Windows
pip install -r requirements.txt
Or using the Makefile:
make install
# Recommended for compliance reasoning tasks
ollama pull llama3.1
# Recommended for embeddings
ollama pull nomic-embed-text
# Alternative: stronger tool-calling model
ollama pull qwen2.5:14b
cp .env.example .env
Edit .env with your preferred settings. At minimum, the defaults work with Ollama running locally. See Configuration Reference for all options.
The .env file controls all application behavior. The minimal configuration for local development:
LLM_PROVIDER=ollama
OLLAMA_MODEL=llama3.1
OLLAMA_BASE_URL=http://localhost:11434
VECTOR_STORE=chroma
For Azure OpenAI:
LLM_PROVIDER=azure_openai
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-key-here
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_OPENAI_API_VERSION=2024-02-01
For OpenAI:
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o
For Anthropic:
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-3-5-sonnet-20241022
Drop your regulatory documents (PDF, DOCX, TXT, CSV, JSON, Markdown) into ./data/regulatory_docs/, then ingest:
python scripts/ingest.py
With custom options:
python scripts/ingest.py \
--dir ./path/to/custom/docs \
--collection my_collection \
--chunk-size 1500 \
--chunk-overlap 300
The ingestion script will:
- Recursively scan the directory for supported file types.
- Parse each file and extract text content.
- Split content into chunks using RecursiveCharacterTextSplitter.
- Embed chunks and store them in the configured vector store.
- Enrich each chunk with source metadata (filename, page number, file type).
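To make the chunking and metadata steps above concrete, here is a minimal sketch using a plain fixed-size sliding window. It is deliberately simpler than RecursiveCharacterTextSplitter, which also prefers to break on separators such as paragraphs and sentences. The function names `chunk_text` and `build_records` and the record keys are hypothetical; only the default sizes (1000 and 200) come from this README.

```python
from pathlib import Path


def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows that overlap by chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


def build_records(filename: str, pages: list[str], **chunk_kwargs: int) -> list[dict]:
    """Attach source metadata (filename, page number, file type) to every chunk."""
    file_type = Path(filename).suffix.lstrip(".").lower()
    return [
        {"text": chunk, "source": filename, "page": page_number, "file_type": file_type}
        for page_number, page_text in enumerate(pages, start=1)
        for chunk in chunk_text(page_text, **chunk_kwargs)
    ]


# A synthetic 2500-character page stands in for parsed document text.
sample_page = "".join(str(i % 10) for i in range(2500))
records = build_records("policy.pdf", [sample_page])
print(len(records), [len(r["text"]) for r in records])
```

With the defaults, each chunk repeats the last 200 characters of its predecessor, so a clause that straddles a chunk boundary is still retrievable in one piece.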
# Start the FastAPI server (with auto-reload)
python main.py
# Or run the interactive CLI
python cli.py
# Or run a single query
python cli.py --query "Identify gaps in our third-party risk policy against ISO 31000"
Using the Makefile:
make run # Start API server
make cli # Interactive CLI
make ingest # Run ingestion pipeline
The CLI provides a REPL interface for compliance queries:
python cli.py
 _____ _____ _____
/ ____|_ _| __ \ /\
| | | | | |__) | / \
| | | | | _ / / /\ \
| |____ _| |_| | \ \ / ____ \
\_____|_____|_| \_\/_/ \_\
Compliance Intelligence & Risk Agent
Type 'help' for commands, 'exit' to quit.
cira> Identify gaps in our information security policy against ISO 27001 Annex A
CLI commands:
| Command | Description |
|---|---|
| help | Show available commands |
| exit / quit | Exit the CLI |
| clear | Clear the screen |
| Any text | Submit as a compliance query |
Single query mode:
python cli.py --query "What are the key data privacy requirements under UAE PDPL?"
The CIRAAgent class provides a programmatic interface:
from cira import CIRAAgent
agent = CIRAAgent()
result = agent.run(
task="gap_analysis",
input={
"document": "path/to/policy.pdf",
"framework": "ISO_27001",
"scope": "information_security"
}
)
print(result.findings) # Detailed gap analysis with citations
print(result.success) # True if findings were generated
print(result.report) # Narrative report text
print(result.raw_messages) # Full LangGraph message history
result = agent.run(
task="vendor_risk_score",
input={
"vendor_name": "Acme Cloud Services",
"contract_path": "contracts/acme_2024.pdf",
"risk_domains": ["data_privacy", "financial_stability", "operational", "cyber"]
}
)
print(result.findings) # Risk assessment with per-domain analysis
answer = agent.query("What are the key requirements of ISO 27001 Section A.8?")
print(answer)
All agent.run() calls return a CIRAResult with the following fields:
| Field | Type | Description |
|---|---|---|
| findings | str | The agent's analysis output with citations |
| risk_ratings | dict[str, str] | Risk ratings per finding (CRITICAL/HIGH/MEDIUM/LOW) |
| audit_trail | list[dict] | Audit events generated during this run |
| report | str | Narrative summary (same as findings by default) |
| raw_messages | list | Full LangGraph message history for debugging |
| success | bool (property) | True if findings are non-empty |
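The field table above maps naturally onto a dataclass. The sketch below is one plausible shape, not the actual definition in cira/agent.py: the class name `CIRAResultSketch` is hypothetical, and the use of `__post_init__` to default `report` to `findings` is an assumption about how "same as findings by default" could be implemented.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class CIRAResultSketch:
    """Illustrative stand-in for CIRAResult, built only from the documented fields."""

    findings: str = ""
    risk_ratings: dict[str, str] = field(default_factory=dict)
    audit_trail: list[dict[str, Any]] = field(default_factory=list)
    report: str = ""
    raw_messages: list[Any] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumed behaviour: the report falls back to the findings text.
        if not self.report:
            self.report = self.findings

    @property
    def success(self) -> bool:
        """True if findings are non-empty."""
        return bool(self.findings)


ok = CIRAResultSketch(findings="Gap: no asset inventory (ISO 27001 A.8)", risk_ratings={"asset_inventory": "HIGH"})
empty = CIRAResultSketch()
print(ok.success, empty.success, ok.report == ok.findings)
```

Exposing `success` as a property rather than a stored field keeps it from drifting out of sync with `findings`.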
Start the API server:
python main.py
The server starts at http://localhost:8000 with auto-generated docs at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
All endpoints except /health require the X-API-Key header.
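The rule above (every path except /health needs a valid X-API-Key) can be sketched as a small framework-free check. This is an assumption about the logic, not the code in cira/api/routes.py, which presumably uses a FastAPI dependency. The function name `is_authorized` is hypothetical; `hmac.compare_digest` is used so the comparison takes constant time.

```python
import hmac


def is_authorized(path: str, headers: dict[str, str], expected_key: str) -> bool:
    """Return True if the request may proceed under the X-API-Key rule."""
    if path == "/health":
        return True  # the health check is the only unauthenticated endpoint
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    supplied = normalised.get("x-api-key", "")
    return hmac.compare_digest(supplied.encode(), expected_key.encode())


print(is_authorized("/health", {}, "s3cret"))
print(is_authorized("/api/v1/analyze", {"X-API-Key": "s3cret"}, "s3cret"))
print(is_authorized("/api/v1/analyze", {"X-API-Key": "wrong"}, "s3cret"))
```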
curl http://localhost:8000/health
Response:
{
"status": "healthy",
"version": "0.1.0",
"llm_provider": "ollama",
"vector_store": "chroma",
"timestamp": "2026-03-05T06:36:00Z"
}
curl -X POST http://localhost:8000/api/v1/analyze \
-H "Content-Type: application/json" \
-H "X-API-Key: your-api-key" \
-d '{
"task": "regulatory_gap_analysis",
"document_path": "data/uploads/policy.pdf",
"framework": "NESA_IA",
"scope": "information_security",
"output_format": "json"
}'
Request body (AnalyzeRequest):
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| task | string | ✅ | (none) | Task type: regulatory_gap_analysis, vendor_risk_score, query |
| document_path | string | ❌ | null | Path to document to analyze |
| framework | string | ❌ | ISO_27001 | Target compliance framework |
| scope | string | ❌ | general | Compliance domain scope |
| query | string | ❌ | null | Natural-language query (for task=query) |
| vendor_name | string | ❌ | null | Vendor name (for risk scoring tasks) |
| risk_domains | string[] | ❌ | null | Risk domains to evaluate |
| output_format | string | ❌ | json | Output: json, text, or docx |
Response body (AnalyzeResponse):
{
"task": "regulatory_gap_analysis",
"status": "completed",
"findings": "## Gap Analysis Results\n...",
"report_path": null,
"audit_trail_count": 0,
"timestamp": "2026-03-05T06:36:00Z"
}
curl -X POST http://localhost:8000/api/v1/ingest \
-H "Content-Type: application/json" \
-H "X-API-Key: your-api-key" \
-d '{
"directory": "./data/regulatory_docs",
"collection_name": "regulatory_docs",
"chunk_size": 1000,
"chunk_overlap": 200
}'
Response:
{
"status": "completed",
"chunks_ingested": 342,
"collection_name": "regulatory_docs",
"timestamp": "2026-03-05T06:36:00Z"
}
curl "http://localhost:8000/api/v1/audit-trail?date=2026-03-05&limit=50" \
-H "X-API-Key: your-api-key"
Query parameters:
| Param | Type | Default | Description |
|---|---|---|---|
| date | string | Today (UTC) | Date in YYYY-MM-DD format |
| action | string | null | Filter by action type |
| limit | int | 100 | Maximum events to return |
Response:
{
"date": "2026-03-05",
"events": [
{
"timestamp": "2026-03-05T06:30:00+00:00",
"action": "gap_analysis",
"detail": "Retrieved 8 passages for ISO_27001/information_security",
"confidence": "HIGH",
"input_summary": null,
"regulatory_ref": null,
"metadata": {}
}
],
"total": 1
}
cira-agent/
├── main.py # FastAPI application entry point
├── cli.py # Interactive CLI with REPL and --query mode
├── cira/ # Core Python package
│ ├── __init__.py # Package root — exports CIRAAgent, __version__
│ ├── config.py # Pydantic Settings — all configuration
│ ├── agent.py # CIRAAgent class — public SDK interface
│ ├── graph.py # LangGraph ReAct agent graph
│ ├── logging.py # Structured logging (structlog) setup
│ ├── llm/ # LLM provider abstraction
│ │ ├── __init__.py
│ │ ├── provider.py # Factory: get_chat_model(), get_embeddings()
│ │ └── prompts.py # All prompt templates (system, gap, risk, etc.)
│ ├── knowledge/ # Knowledge base & vector store
│ │ ├── __init__.py
│ │ ├── vector_store.py # ChromaDB / pgvector abstraction
│ │ ├── ingestor.py # Document ingestion pipeline
│ │ └── frameworks/ # Built-in framework definition files
│ │ └── .gitkeep
│ ├── tools/ # LangGraph tools (agent capabilities)
│ │ ├── __init__.py # Exports ALL_TOOLS list
│ │ ├── gap_analysis.py # @tool — compliance gap analysis
│ │ ├── risk_scorer.py # @tool — vendor risk scoring
│ │ ├── doc_parser.py # @tool — document parsing
│ │ ├── audit_logger.py # Audit trail (not a @tool — called internally)
│ │ ├── alert_engine.py # @tool — compliance alerting + webhooks
│ │ └── report_gen.py # @tool — report generation (JSON/TXT/DOCX)
│ └── api/ # REST API layer
│ ├── __init__.py
│ ├── routes.py # FastAPI route definitions + auth
│ └── schemas.py # Pydantic request/response schemas
├── data/ # Runtime data directories
│ ├── regulatory_docs/ # Drop your documents here for ingestion
│ │ └── .gitkeep
│ ├── chroma/ # ChromaDB persistent store (git-ignored)
│ └── uploads/ # Temporary upload directory
│ └── .gitkeep
├── scripts/ # Utility scripts
│ ├── ingest.py # Knowledge base ingestion CLI
│ └── benchmark.py # Performance benchmarking
├── tests/ # Test suite
│ ├── __init__.py
│ ├── test_agent.py # Agent + CIRAResult unit tests
│ ├── test_tools.py # Tool unit tests (mocked vector store)
│ └── test_api.py # API integration tests (FastAPI TestClient)
├── .env.example # Environment variable template (documented)
├── .gitignore # Git ignore rules
├── requirements.txt # Python dependencies (pinned ranges)
├── pyproject.toml # Project metadata, pytest/ruff/mypy config
├── Dockerfile # Production container image
├── docker-compose.yml # Multi-service orchestration (app + Ollama)
├── Makefile # Developer commands (install/run/test/lint)
├── LICENSE # MIT License
└── README.md # This file
The single source of truth for all application settings. Uses Pydantic Settings to load and validate environment variables with type safety.
from cira.config import settings
# Access any setting
print(settings.llm_provider) # LLMProvider.OLLAMA
print(settings.ollama_model) # "llama3.1"
print(settings.vector_store) # VectorStoreBackend.CHROMA
print(settings.api_port) # 8000
# Ensure runtime directories exist
settings.ensure_directories()
Key design decisions:
- Enum types (LLMProvider, EmbeddingProvider, VectorStoreBackend) prevent typos and enable IDE autocomplete.
- SettingsConfigDict with case_sensitive=False and extra="ignore" for flexible env var handling.
- A module-level settings singleton is instantiated on import and shared across the entire app.
Two factory functions with lazy imports — only the selected provider's SDK is loaded at runtime:
from cira.llm.provider import get_chat_model, get_embeddings
llm = get_chat_model() # Returns BaseChatModel
embeddings = get_embeddings() # Returns Embeddings
Provider selection is automatic based on LLM_PROVIDER and EMBEDDING_PROVIDER env vars.
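A lazy-import factory of the kind described above can be sketched with `importlib`. The registry below names LangChain integration classes that exist in their respective packages, but whether cira/llm/provider.py uses exactly these is an assumption. The helper `resolve_class` is hypothetical, and the demonstration uses standard-library stand-ins so it runs without any LLM SDK installed.

```python
import importlib

# Provider value -> (module path, class name). Nothing is imported until selected.
# Assumption: CIRA maps its LLM_PROVIDER values to these LangChain classes.
CHAT_MODEL_REGISTRY = {
    "ollama": ("langchain_ollama", "ChatOllama"),
    "azure_openai": ("langchain_openai", "AzureChatOpenAI"),
    "openai": ("langchain_openai", "ChatOpenAI"),
    "anthropic": ("langchain_anthropic", "ChatAnthropic"),
}


def resolve_class(provider: str, registry: dict[str, tuple[str, str]]) -> type:
    """Import and return the class for one provider, leaving the others unloaded."""
    try:
        module_name, class_name = registry[provider]
    except KeyError:
        raise ValueError(f"Unsupported provider: {provider!r}") from None
    module = importlib.import_module(module_name)  # the import happens only here
    return getattr(module, class_name)


# Stand-in registry built from stdlib modules, purely for demonstration.
demo_registry = {"fraction": ("fractions", "Fraction"), "decimal": ("decimal", "Decimal")}
selected = resolve_class("fraction", demo_registry)
print(selected.__name__)
```

The benefit of this shape is that a deployment using only Ollama never needs the OpenAI or Anthropic packages installed.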
Builds and compiles the LangGraph agent with:
- A system prompt (cira/llm/prompts.py) that defines CIRA's compliance reasoning rules.
- All tools from cira/tools/ bound to the LLM via llm.bind_tools().
- Conditional routing: tool calls → tool execution → back to agent, or final answer → end.
The CIRAAgent class wraps the graph and provides:
- run(task, input) → CIRAResult: structured task execution
- query(question) → str: simple question answering
- Automatic task-to-message translation for different task types
- Audit trail logging for task start/completion
CIRA supports four LLM providers. Set LLM_PROVIDER in your .env:
| Provider | Value | Requires | Best For |
|---|---|---|---|
| Ollama | ollama | Ollama running locally | Privacy, offline use, development |
| Azure OpenAI | azure_openai | Azure subscription + deployment | Enterprise with Azure |
| OpenAI | openai | OpenAI API key | Quick start, strong reasoning |
| Anthropic | anthropic | Anthropic API key | Long-context analysis |
Set EMBEDDING_PROVIDER independently of the LLM provider:
| Provider | Value | Default Model |
|---|---|---|
| Ollama | ollama | nomic-embed-text |
| Azure OpenAI | azure_openai | (set via AZURE_OPENAI_EMBEDDING_DEPLOYMENT) |
| OpenAI | openai | text-embedding-3-small |
| Backend | Value | Use Case |
|---|---|---|
| ChromaDB | chroma | Local development, small-medium datasets |
| pgvector | pgvector | Production, large datasets, concurrent access |
| Variable | Type | Default | Description |
|---|---|---|---|
| LLM_PROVIDER | enum | ollama | LLM backend: ollama, azure_openai, openai, anthropic |
| OLLAMA_MODEL | string | llama3.1 | Ollama model name |
| OLLAMA_BASE_URL | string | http://localhost:11434 | Ollama API endpoint |
| AZURE_OPENAI_ENDPOINT | string | (none) | Azure OpenAI resource URL |
| AZURE_OPENAI_API_KEY | string | (none) | Azure OpenAI API key |
| AZURE_OPENAI_DEPLOYMENT | string | (none) | Azure OpenAI deployment name |
| AZURE_OPENAI_API_VERSION | string | 2024-02-01 | Azure API version |
| OPENAI_API_KEY | string | (none) | OpenAI API key |
| OPENAI_MODEL | string | gpt-4o | OpenAI model name |
| ANTHROPIC_API_KEY | string | (none) | Anthropic API key |
| ANTHROPIC_MODEL | string | claude-3-5-sonnet-20241022 | Anthropic model name |
| EMBEDDING_PROVIDER | enum | ollama | Embedding backend: ollama, azure_openai, openai |
| OLLAMA_EMBEDDING_MODEL | string | nomic-embed-text | Ollama embedding model |
| OPENAI_EMBEDDING_MODEL | string | text-embedding-3-small | OpenAI embedding model |
| VECTOR_STORE | enum | chroma | Vector store: chroma, pgvector |
| CHROMA_PERSIST_DIR | path | ./data/chroma | ChromaDB storage directory |
| PGVECTOR_CONNECTION_STRING | string | (none) | PostgreSQL connection string for pgvector |
| KNOWLEDGE_BASE_DIR | path | ./data/regulatory_docs | Directory for document ingestion |
| API_HOST | string | 0.0.0.0 | API server bind address |
| API_PORT | int | 8000 | API server port |
| API_KEY | string | changeme-in-production | API authentication key |
| LOG_LEVEL | enum | INFO | Logging level: DEBUG, INFO, WARNING, ERROR, CRITICAL |
| LOG_DIR | path | ./logs | Log file directory |
| AUDIT_TRAIL_DIR | path | ./data/audit_trail | Audit trail storage directory |
CIRA's agent has access to the following tools, each implemented as a LangChain @tool:
Module: cira/tools/gap_analysis.py
Searches the regulatory knowledge base for compliance framework requirements relevant to a query.
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | (required) | The compliance question or policy description |
| framework | str | ISO_27001 | Target compliance framework |
| scope | str | general | Compliance domain scope |
Returns formatted regulatory excerpts with source citations (filename, page number).
Module: cira/tools/risk_scorer.py
Retrieves compliance information for multi-domain vendor risk assessment.
| Parameter | Type | Default | Description |
|---|---|---|---|
| vendor_name | str | (required) | Name of the vendor |
| risk_domains | str | data_privacy,operational,cyber | Comma-separated risk domains |
| context | str | "" | Additional vendor context |
Default risk domain weights:
| Domain | Weight |
|---|---|
| data_privacy | 25% |
| cyber | 25% |
| operational | 20% |
| financial_stability | 15% |
| contractual | 15% |
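The README gives the default weights but does not say how they are combined. One plausible reading is a weighted average, renormalised over whichever domains were actually scored. The sketch below assumes that reading, plus a 0 to 100 score scale where higher means riskier. The function `weighted_risk_score` is hypothetical and may not match cira/tools/risk_scorer.py.

```python
DEFAULT_WEIGHTS = {
    "data_privacy": 0.25,
    "cyber": 0.25,
    "operational": 0.20,
    "financial_stability": 0.15,
    "contractual": 0.15,
}


def weighted_risk_score(domain_scores: dict[str, float], weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted average of per-domain scores, renormalised over the scored domains."""
    unknown = set(domain_scores) - set(weights)
    if unknown:
        raise ValueError(f"No weight configured for: {sorted(unknown)}")
    total_weight = sum(weights[domain] for domain in domain_scores)
    if total_weight == 0:
        raise ValueError("At least one scored domain with a non-zero weight is required")
    weighted_sum = sum(score * weights[domain] for domain, score in domain_scores.items())
    return weighted_sum / total_weight


# Three of the five domains scored: (80*0.25 + 60*0.25 + 40*0.20) / 0.70
partial = weighted_risk_score({"data_privacy": 80, "cyber": 60, "operational": 40})
print(round(partial, 2))
```

Renormalising means a vendor assessed on only three domains is not artificially pulled toward zero by the two missing ones.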
Module: cira/tools/doc_parser.py
Parses a document file and returns extracted text. Supported formats: PDF, DOCX, TXT, CSV, JSON, Markdown. Content is truncated at 50,000 characters for LLM context window safety.
Module: cira/tools/alert_engine.py
Sends a compliance alert with optional webhook delivery.
| Parameter | Type | Default | Description |
|---|---|---|---|
| title | str | (required) | Alert title |
| severity | str | (required) | CRITICAL / HIGH / MEDIUM / LOW |
| description | str | (required) | Alert details |
| framework | str | "" | Related framework |
| webhook_url | str | "" | URL for HTTP POST delivery |
Module: cira/tools/report_gen.py
Generates exportable compliance reports in JSON, plain text, or DOCX format. Reports are saved to ./data/reports/ with timestamped filenames.
CIRA maintains an immutable, append-only audit trail for every agent action. This is critical for GRC compliance — every recommendation, finding, and decision has full provenance.
Audit events are stored as JSONL (one JSON object per line) in daily files:
data/audit_trail/
├── audit_2026-03-04.jsonl
├── audit_2026-03-05.jsonl
└── ...
{
"timestamp": "2026-03-05T06:30:00+00:00",
"action": "gap_analysis",
"detail": "Retrieved 8 passages for ISO_27001/information_security",
"confidence": "HIGH",
"input_summary": "Perform a compliance gap analysis...",
"regulatory_ref": "ISO 27001:2022 Annex A.8",
"metadata": {}
}
| Field | Description |
|---|---|
| timestamp | UTC ISO 8601 timestamp |
| action | Action type (e.g., gap_analysis, vendor_risk_assessment, alert_sent, report_generated, task_started:*, task_completed:*) |
| detail | Human-readable description of what happened |
| confidence | Agent confidence level: HIGH, MEDIUM, LOW |
| input_summary | Summary of the input that triggered this action |
| regulatory_ref | Specific regulatory clause or framework reference |
| metadata | Additional structured data |
Audit trails can be exported as formatted JSON:
from cira.tools.audit_logger import export_audit_trail
export_audit_trail(date="2026-03-05", output_path="./exports/audit.json")
Or retrieved programmatically:
from cira.tools.audit_logger import get_audit_trail
events = get_audit_trail(date="2026-03-05", action_filter="gap_analysis", limit=50)
CIRA supports any compliance framework through its RAG knowledge base. The following are reference frameworks the system is optimized for:
| Category | Frameworks |
|---|---|
| Information Security | ISO/IEC 27001:2022, NESA UAE Information Assurance Standards |
| Risk Management | ISO 31000:2018, COBIT 2019 |
| Business Continuity | ISO 22301:2019 |
| ESG & Sustainability | GRI Standards 2021, TCFD, SASB |
| Data Privacy | UAE PDPL, Saudi PDPL, GDPR (reference) |
| Quality Management | ISO 9001:2015 |
| HSE | ISO 45001:2018 |
- Place the framework documents (PDF, DOCX, TXT) in ./data/regulatory_docs/
- Run the ingestion pipeline: python scripts/ingest.py
- The documents are automatically chunked, embedded, and indexed.
- The agent will now include them in its knowledge retrieval.
You can organize documents into subdirectories — the ingestion script scans recursively:
data/regulatory_docs/
├── iso_27001/
│ ├── ISO_27001_2022_full.pdf
│ └── annex_a_controls.pdf
├── uae_pdpl/
│ └── UAE_PDPL_2021.pdf
└── internal/
├── company_security_policy.docx
└── vendor_risk_procedures.pdf
Build and run a standalone container:
docker build -t cira-agent .
docker run -p 8000:8000 --env-file .env cira-agent
The Dockerfile uses python:3.11-slim and includes system dependencies for document parsing (poppler, tesseract).
The included docker-compose.yml runs CIRA with Ollama:
# Start all services
docker compose up -d
# View logs
docker compose logs -f cira
# Stop
docker compose down
Services:
| Service | Description | Port |
|---|---|---|
| cira | CIRA API server | 8000 |
| ollama | Local LLM server | 11434 |
For production with pgvector, uncomment the postgres service in docker-compose.yml and set:
VECTOR_STORE=pgvector
PGVECTOR_CONNECTION_STRING=postgresql://cira:changeme@postgres:5432/cira
- API Key: Change API_KEY from the default to a secure random string.
- CORS: Restrict allow_origins in main.py from ["*"] to your specific domains.
- HTTPS: Deploy behind a reverse proxy (nginx, Caddy) with TLS termination.
- Vector Store: Switch from ChromaDB to pgvector for concurrent access and durability.
- Logging: Set LOG_LEVEL=WARNING in production to reduce noise.
- Resources: LLM inference (especially local Ollama) is memory-intensive. Allocate at least 8GB RAM for 7B parameter models, 16GB+ for 14B models.
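For the API key item in the checklist above, one way to produce a suitable value is Python's `secrets` module. The guard function `assert_production_key` is a hypothetical startup check, not something the project is documented to include. The default string it rejects is the one listed for API_KEY in the configuration table.

```python
import secrets

DEFAULT_API_KEY = "changeme-in-production"  # documented default for API_KEY


def assert_production_key(api_key: str) -> None:
    """Hypothetical startup guard: refuse to run with the default or a short key."""
    if api_key == DEFAULT_API_KEY or len(api_key) < 32:
        raise ValueError("API_KEY must be changed to a long random value")


# 32 random bytes, URL-safe base64 encoded without padding.
new_key = secrets.token_urlsafe(32)
assert_production_key(new_key)
print(f"API_KEY={new_key}")
```

The printed line can be pasted into .env. The value should be kept out of version control.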
Run the full test suite:
pytest tests/ -v
Or using the Makefile:
make test # Run tests
make test-cov # Run tests with coverage report
| File | What It Tests | Strategy |
|---|---|---|
| tests/test_agent.py | CIRAAgent, CIRAResult | Unit tests with mocked graph |
| tests/test_tools.py | Audit logger, gap analysis tool | Unit tests with mocked vector store |
| tests/test_api.py | FastAPI endpoints | Integration tests with TestClient |
Tests use unittest.mock.patch to isolate dependencies (LLM, vector store) so they run without an Ollama instance or real API keys.
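The isolation approach described above might look like the following. The `VectorStore` class and `gap_search` function are hypothetical stand-ins defined inline so the example is self-contained. The project's real tests patch its own modules, whose import paths are not shown in this README.

```python
from unittest.mock import patch


class VectorStore:
    """Hypothetical stand-in for a real vector store client."""

    def similarity_search(self, query: str, k: int = 4) -> list[str]:
        raise RuntimeError("no vector store available in unit tests")


def gap_search(store: VectorStore, query: str) -> str:
    """Format retrieved passages as a numbered list, as a tool might."""
    passages = store.similarity_search(query, k=2)
    if not passages:
        return "No relevant passages found."
    return "\n".join(f"[{index}] {passage}" for index, passage in enumerate(passages, start=1))


def test_gap_search_formats_passages() -> None:
    fake_passages = ["A.8.1 Asset inventory", "A.8.2 Ownership"]
    # Replace the network-bound method with a mock that returns canned passages.
    with patch.object(VectorStore, "similarity_search", return_value=fake_passages) as mocked:
        output = gap_search(VectorStore(), "asset management")
    mocked.assert_called_once_with("asset management", k=2)
    assert output == "[1] A.8.1 Asset inventory\n[2] A.8.2 Ownership"


test_gap_search_formats_passages()
print("ok")
```

Under pytest the final two lines would be dropped, since the runner discovers and calls `test_` functions itself.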
make install # Install dependencies
make dev # Install deps + dev tools (ruff, mypy)
make run # Start API server
make cli # Interactive CLI
make ingest # Run document ingestion
make test # Run test suite
make test-cov # Tests with HTML coverage report
make lint # Lint with ruff
make format # Auto-format with ruff
make typecheck # Type-check with mypy
make clean # Remove __pycache__, .pytest_cache, etc.
make docker-build # Build Docker image
make docker-up # Start Docker Compose stack
make docker-down # Stop Docker Compose stack
The project is configured with:
- Ruff: linting and formatting (configured in pyproject.toml)
- mypy: static type checking with disallow_untyped_defs
- pytest: testing with async support
- Core LangGraph agent loop (ReAct pattern)
- Ollama + Azure OpenAI + OpenAI + Anthropic provider support
- ChromaDB vector store with document ingestion pipeline
- Compliance gap analysis tool
- Third-party risk scoring tool
- Immutable audit trail module (JSONL)
- FastAPI REST API with OpenAPI docs
- Report generation (JSON, TXT, DOCX)
- Real-time alerting with webhook delivery
- Docker + Docker Compose deployment
- Structured logging with structlog
- Pydantic Settings-based configuration
- pgvector production backend
- ServiceNow GRC integration
- Power BI connector
- Web UI dashboard (React)
- Automated regulatory change monitoring (scraper + alerting)
- Multi-tenant support with RBAC
- Kubernetes deployment manifests
- Arabic language regulatory document support
- File upload API endpoint
- Streaming responses (SSE)
- Rate limiting and usage tracking
Contributions are welcome. Please open an issue to discuss what you'd like to change before submitting a pull request.
- Fork the repository
- Create a feature branch (git checkout -b feature/your-feature)
- Commit your changes (git commit -m 'Add your feature')
- Push to the branch (git push origin feature/your-feature)
- Open a Pull Request

- Follow the existing code style (enforced by ruff).
- Add type hints to all function signatures.
- Include tests for new functionality.
- Update the README if you add new tools, endpoints, or configuration options.
- Keep prompts in cira/llm/prompts.py; do not hardcode prompts in tool or agent code.
MIT License — see LICENSE for details.
CIRA is a strategic advisory and monitoring tool. All outputs are informational in nature and do not constitute legal, regulatory, or professional compliance advice. Always engage qualified legal or compliance professionals for formal regulatory obligations. The confidence scores and risk ratings provided are AI-generated assessments and should be validated by domain experts before being used in formal compliance processes.