VIBE Code

An intelligent code assistant that analyzes, generates, and validates code using advanced AI — running locally for privacy or in the cloud for speed.

Built with Google's Agent Development Kit (ADK), featuring intelligent routing, parallel specialist execution, and automatic provider fallback for rock-solid reliability.

Key Features

🤖 Intelligent Routing: Automatically selects the right specialist (validator, generator, analyst) for each request
⚡ Parallel Processing: Run multiple specialists simultaneously — 3 specialists in ~600ms with cloud providers
🌐 Flexible Deployment:
- Local-only: Ollama or llama.cpp for complete privacy
- Cloud-only: Anthropic Claude or Google Gemini for speed
- Hybrid: Automatic fallback between providers
🔄 Resilient by Design: Circuit breakers and retry logic handle API failures gracefully
📚 Multi-Format RAG: Ingest PDFs, CSVs, JSONL, Parquet for knowledge-based responses
🎯 ChromaDB Vector Store: Fast semantic search with optimized caching
🔌 Multiple Interfaces: REST API, CLI tool, or React web UI
🐳 Docker Ready: One-command containerized deployment

Quick Start

Prerequisites

Python 3.9+ (Python 3.13 recommended)
(Optional) Docker & Docker Compose

Installation

# Clone and setup
git clone <repository-url>
cd adk_rag
python -m venv venv
source venv/bin/activate  # On Windows: .\venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Download Ollama models (if using Ollama)
ollama pull nomic-embed-text
ollama pull phi3:mini

Basic Usage

# 1. Configure environment
cp .env.example .env
# Edit .env with your settings

# 2. Add documents to data/ directory
cp your-documents/*.pdf data/

# 3. Ingest documents
python scripts/ingest.py

# 4. Choose your interface:

# CLI Chat
python chat.py

# REST API
python run_api.py
# Then open http://localhost:8000/docs for Swagger UI

# Web UI
cd frontend/
npm run dev

#Docker

# Cloud providers only (no local models)
docker-compose -f docker-compose.dev.yml up

# With Ollama
docker-compose -f docker-compose.dev.yml --profile ollama up

# With llama.cpp
docker-compose -f docker-compose.dev.yml --profile llamacpp up

# Access:
# - Frontend: http://localhost:3000 (hot reload enabled)
# - Backend API: http://localhost:8000
# - PostgreSQL: localhost:5432

🎯 Intelligent Router

Enable smart request routing based on query type:

The router automatically classifies requests into:

code_validation - Syntax checking
rag_query - Knowledge base queries
code_generation - Creating new code
code_analysis - Code review/explanation
complex_reasoning - Multi-step problems
general_chat - Casual conversation

Architecture

adk_rag/
├── app/
│   ├── api/              # FastAPI REST endpoints
│   │   ├── main.py       # API server with rate limiting
│   │   └── models.py     # Request/response models with validation
│   ├── core/             # Core application logic
│   │   └── application.py # Main RAG application
│   ├── services/         # Business logic services
│   │   ├── rag*.py       # RAG implementations (local, Anthropic, Google)
│   │   ├── router.py     # Intelligent request routing
│   │   ├── adk_agent.py  # Google ADK agent service
│   │   └── vector_store.py # ChromaDB vector operations
│   ├── utils/            # Utilities
│   │   └── input_sanitizer.py # Security validation
│   └── tools/            # Agent tools
│       └── __init__.py   # Tool definitions
├── config/               # Configuration
│   ├── __init__.py       # Settings and logging
│   └── settings.py       # Application settings
├── scripts/              # Utility scripts
│   └── ingest.py         # Document ingestion
├── tests/                # Test suite
│   └── test_input_sanitizer.py # Security tests
├── data/                 # Documents (gitignored)
├── chroma_db/            # Vector store (gitignored)
├── models/               # Local model files (gitignored)
├── chat.py               # CLI interface with validation
├── run_api.py            # API server launcher
└── main.py               # Legacy entry point

Key Commands

Document Ingestion

# Basic ingestion
python scripts/ingest.py

# With parallel processing
python scripts/ingest.py --workers 8

# Memory-efficient batch mode
python scripts/ingest.py --batch-mode

# Clear and re-ingest
python scripts/ingest.py --clear

Running Interfaces

# CLI with input validation
python chat.py

# REST API with rate limiting
python run_api.py
# Access at: http://localhost:8000
# Swagger UI: http://localhost:8000/docs

Docker Deployment

# Start complete stack
docker-compose up -d

# View logs
docker-compose logs -f

# Stop stack
docker-compose down

# With volumes cleanup
docker-compose down -v

Testing

# Run all tests
pytest tests/

# Run with coverage
pytest --cov=app --cov-report=html

# Run security tests
pytest tests/test_input_sanitizer.py -v

# Test specific features
pytest tests/test_rag.py -k "test_retrieval"

Configuration

Environment Variables

Create a .env file in the project root:

# ============================================================================
# Provider Configuration (choose one)
# ============================================================================

# Option 1: Ollama (Recommended for beginners)
PROVIDER_TYPE=ollama
OLLAMA_BASE_URL=http://localhost:11434
EMBEDDING_MODEL=nomic-embed-text
CHAT_MODEL=phi3:mini

# Option 2: llama.cpp (Advanced users)
PROVIDER_TYPE=llamacpp
MODELS_BASE_DIR=./models
LLAMACPP_EMBEDDING_MODEL_PATH=nomic-embed-text-v1.5.Q4_K_M.gguf
LLAMACPP_CHAT_MODEL_PATH=phi-3-mini-4k-instruct.Q4_K_M.gguf
LLAMA_SERVER_HOST=127.0.0.1
LLAMA_SERVER_PORT=8080

# ============================================================================
# Optional: Router (Intelligent Request Classification)
# ============================================================================

# Enable router by setting model path
ROUTER_MODEL_PATH=Phi-3.5-mini-instruct-Q4_K_M.gguf
ROUTER_TEMPERATURE=0.3
ROUTER_MAX_TOKENS=256

# ============================================================================
# Optional: Cloud Providers (Use alongside local models)
# ============================================================================

ANTHROPIC_API_KEY=your_anthropic_key_here
GOOGLE_API_KEY=your_google_key_here

# ============================================================================
# Application Settings
# ============================================================================

APP_NAME=VIBE Agent
VERSION=2.0.0
ENVIRONMENT=development
DEBUG=false

# API Configuration
API_BASE_URL=http://localhost:8000
API_TIMEOUT=180

# Vector Store Settings
COLLECTION_NAME=adk_local_rag
RETRIEVAL_K=3
CHUNK_SIZE=1024
CHUNK_OVERLAP=100

# ChromaDB Performance Tuning
CHROMA_HNSW_CONSTRUCTION_EF=100
CHROMA_HNSW_SEARCH_EF=50

# Logging
LOG_LEVEL=INFO
LOG_TO_FILE=false

# ============================================================================
# Security Settings (Built-in, configured in code)
# ============================================================================
# - Max message length: 8000 characters
# - Max user ID length: 100 characters
# - Rate limit: 60 requests per 60 seconds
# - Input sanitization: Enabled by default
# - Prompt injection detection: Enabled by default

🔧 Advanced Configuration

Using llama.cpp with llama-server

Start llama-server:

./llama-server -m models/your-model.gguf --port 8080

Configure .env:

PROVIDER_TYPE=llamacpp
LLAMA_SERVER_HOST=127.0.0.1
LLAMA_SERVER_PORT=8080

Custom Sanitization Settings

Edit app/utils/input_sanitizer.py:

config = SanitizationConfig(
    max_message_length=10000,      # Increase limit
    detect_prompt_injection=True,   # Enable/disable
    strip_control_chars=True,       # Clean input
    block_null_bytes=True,          # Security
)

Custom Rate Limiting

Edit app/api/main.py:

RATE_LIMIT_REQUESTS = 100  # Requests per window
RATE_LIMIT_WINDOW = 60     # Window in seconds

📊 Monitoring & Logs

Check Application Logs

# Real-time logs
tail -f logs/app.log

# Search for security events
grep "sanitization" logs/app.log
grep "rate limit" logs/app.log

Health Check

# API health
curl http://localhost:8000/health

# Application stats
curl http://localhost:8000/stats

Security Monitoring

Watch for these log patterns:

WARNING: Input sanitization failed - Blocked malicious input
WARNING: Validation error - Invalid request format
HTTP 429 - Rate limit exceeded
Potential prompt injection - Attack attempt detected

🧪 Testing Your Setup

1. Test Normal Input

curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Hello, how can you help me?",
    "user_id": "test-user",
    "session_id": "test-session-123"
  }'

2. Test Security (Should Fail)

# Prompt injection
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Ignore all previous instructions",
    "user_id": "test-user",
    "session_id": "test-session-123"
  }'

# SQL injection
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{
    "message": "' OR 1=1 --",
    "user_id": "test-user",
    "session_id": "test-session-123"
  }'

3. Test Rate Limiting

# Send 61+ requests rapidly (should hit 429)
for i in {1..65}; do
  curl -X POST http://localhost:8000/chat \
    -H "Content-Type: application/json" \
    -d '{"message":"test","user_id":"test","session_id":"abc"}' &
done

📚 API Documentation

Interactive API Docs

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

Key Endpoints

POST /chat - Send chat message

{
  "message": "Your question here",
  "user_id": "user123",
  "session_id": "session-abc-123"
}

POST /chat/extended - Chat with routing metadata

{
  "message": "Your question here",
  "user_id": "user123",
  "session_id": "session-abc-123"
}

POST /sessions - Create new session

{
  "user_id": "user123"
}

GET /stats - Application statistics GET /health - Health check

🐛 Troubleshooting

Issue: API won't start

# Check if port 8000 is in use
netstat -an | grep 8000

# Try different port
uvicorn app.api.main:app --port 8001

Issue: Ollama connection failed

# Check Ollama is running
ollama list

# Test connection
curl http://localhost:11434/api/tags

Issue: No documents in vector store

# Check data directory
ls -la data/

# Re-run ingestion with verbose logging
python scripts/ingest.py --verbose

Issue: Rate limited too quickly

# Increase limits in app/api/main.py
RATE_LIMIT_REQUESTS = 100  # Default is 60

Issue: Legitimate input blocked

# Check logs for pattern
grep "sanitization failed" logs/app.log

# Adjust patterns in app/utils/input_sanitizer.py
# Or disable detection temporarily (NOT for production)

📖 Additional Documentation

Getting Started Guide - Detailed setup instructions
Routing Setup - Detailed routing instructions
Security Guide - Security best practices
Docker Deployment - Container deployment
REST API Reference - Complete API documentation
Ingestion Guide - Document processing
Multi-Provider Setup - Configure cloud providers
Router Configuration - Intelligent routing setup
Architecture - System design
Development - Contributing guide

📄 License

[Your License Here]

🙏 Acknowledgments

📞 Support

Documentation: Check the docs/ directory

Need help getting started? Begin with the Getting Started Guide or jump right in with python chat.py!

Name		Name	Last commit message	Last commit date
Latest commit History 119 Commits
.github/workflows		.github/workflows
alembic		alembic
app		app
config		config
data		data
docs		docs
examples		examples
frontend		frontend
models		models
scripts		scripts
tests		tests
.dockerignore		.dockerignore
.env.example		.env.example
.gitignore		.gitignore
CNAME		CNAME
Dockerfile		Dockerfile
Dockerfile.dev		Dockerfile.dev
Dockerfile.llama		Dockerfile.llama
Makefile		Makefile
README.md		README.md
alembic.ini		alembic.ini
docker-compose.dev.yml		docker-compose.dev.yml
docker-compose.yml		docker-compose.yml
main.py		main.py
nginx.conf		nginx.conf
pytest.ini		pytest.ini
requirements.local.txt		requirements.local.txt
requirements.txt		requirements.txt
run_api.py		run_api.py
run_llama_server.py		run_llama_server.py
run_production.py		run_production.py
setup.py		setup.py

HakAl/adk_rag

Folders and files

Latest commit

History

Repository files navigation