Production-ready intelligent document management system with multi-agent architecture, powered by Google Gemini 2.5 and advanced RAG techniques.
Project Polaris is a comprehensive document intelligence system designed for Part 2 of the AI Agent Engineer Assessment. It provides:
- Advanced RAG Pipeline: HyDE, hybrid search, reranking, and reciprocal rank fusion
- Multi-Agent Architecture: Specialized agents for routing, querying, and summarization
- Production-Ready: FastAPI backend, Streamlit UI, Docker support, monitoring
- Gemini 2.5 Integration: Flash for speed, Pro for quality
- Supabase PGVector: Direct connection to existing n8n workflow database
- HyDE (Hypothetical Document Embeddings) - Improved query understanding
- Hybrid Search - Vector similarity + keyword matching with BM25
- Cross-Encoder Reranking - Accurate relevance scoring
- Reciprocal Rank Fusion - Intelligent result combination
- Multi-Query Generation - Multiple query perspectives
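Reciprocal rank fusion is the easiest of these to show concretely. Below is a minimal sketch of the standard RRF formula, score(d) = sum over result lists of 1 / (k + rank(d)) with the conventional k = 60; it is independent of Polaris's actual fusion.py implementation.

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse ranked lists of document IDs (best first) into one ranking."""
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)  # standard RRF contribution
    return sorted(scores, key=scores.get, reverse=True)

# Example: fuse vector-search hits with BM25 hits
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_a", "doc_d"]
print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
```

Documents ranked highly by both retrievers (doc_a, doc_b) end up ahead of documents seen by only one.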
- Router Agent - Intelligent query classification and routing
- Query Agent - Information retrieval with Gemini Flash
- Summary Agent - Document analysis with Gemini Pro
- Tool Agent - Extensible function execution
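To illustrate the routing idea only: the real Router Agent classifies queries with Gemini Flash, but a keyword stand-in makes the contract clear (query in, agent name out).

```python
# Purely illustrative keyword router; the production router uses an LLM.
def route(query: str) -> str:
    q = query.lower()
    if any(w in q for w in ("summarize", "summary", "overview")):
        return "summary_agent"   # document analysis path (Gemini Pro)
    if any(w in q for w in ("run", "execute", "tool")):
        return "tool_agent"      # function-execution path
    return "query_agent"         # default retrieval path (Gemini Flash)

print(route("Summarize all client feedback"))  # summary_agent
```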
- FastAPI Backend - High-performance REST API
- Streamlit UI - Beautiful, interactive interface
- Prometheus Metrics - Comprehensive monitoring
- Docker Support - Containerized deployment
- Security - JWT authentication, rate limiting
- Comprehensive Logging - Structured logging with rotation
┌───────────────────────────────────────────────────────────────┐
│                     User Interface Layer                      │
│          FastAPI REST API │ Streamlit UI │ CLI Tool           │
└───────────────────────────────────────────────────────────────┘
                               ▼
┌───────────────────────────────────────────────────────────────┐
│                  AI Agent Layer (LangChain)                   │
│          Router Agent │ Query Agent │ Summary Agent           │
│         (Gemini Flash)  (Gemini Flash)  (Gemini Pro)          │
└───────────────────────────────────────────────────────────────┘
                               ▼
┌───────────────────────────────────────────────────────────────┐
│                     Advanced RAG Pipeline                     │
│         HyDE │ Hybrid Search │ Reranking │ Generation         │
└───────────────────────────────────────────────────────────────┘
                               ▼
┌───────────────────────────────────────────────────────────────┐
│                          Data Layer                           │
│        Supabase PGVector │ Redis Cache │ Conv. History        │
└───────────────────────────────────────────────────────────────┘
- Python 3.11+ (M4 Pro optimized)
- Google AI API Key (Get one here)
- Supabase Project with PGVector (from n8n workflow)
- Redis (optional, for caching)
# Clone repository
git clone https://github.com/your-username/project-polaris.git
cd project-polaris
# Create virtual environment (Python 3.11+ recommended)
python3.11 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
# Copy and configure environment
cp .env.example .env
nano .env  # Add your API keys and database URL

Edit .env file with your credentials:
# Google Gemini API (Required)
GOOGLE_API_KEY="your_gemini_api_key_here"
# Supabase Database (Required)
DATABASE_URL="postgresql://postgres:[email protected]:5432/postgres"
# Collection name (must match n8n workflow)
COLLECTION_NAME="google_drive_documents"
# Optional: Redis for caching
REDIS_URL="redis://localhost:6379/0"
# Optional: Environment settings
ENVIRONMENT="development"
LOG_LEVEL="INFO"

# Test database connection and API keys
python scripts/test_connection.py
# Expected output:
# ✅ Database connection successful
# ✅ Vector store accessible
# ✅ Google Gemini API working
# ✅ Collection 'google_drive_documents' found with X documents
# Quick functionality test
python scripts/quick_test.py

# Terminal 1: Start FastAPI server
source venv/bin/activate
python -m uvicorn src.api.main:app --reload --host 127.0.0.1 --port 8000
# Terminal 2: Start Streamlit UI
source venv/bin/activate
streamlit run ui/streamlit_app.py --server.port 8501
# Access the application:
# - API Documentation: http://localhost:8000/docs
# - Streamlit UI: http://localhost:8501
# - System Info: http://localhost:8000/api/v1/system/info

# Start API server (production settings)
source venv/bin/activate
python -m uvicorn src.api.main:app --host 0.0.0.0 --port 8000 --workers 4
# Start Streamlit (in another terminal)
source venv/bin/activate
streamlit run ui/streamlit_app.py --server.port 8501 --server.headless true

# Build and run all services
docker-compose up --build
# Or run in background
docker-compose up -d --build
# Access:
# - API: http://localhost:8000
# - UI: http://localhost:8501
# - Prometheus: http://localhost:9090
# - Redis: localhost:6379

# Build and run with Docker Compose
docker-compose up --build
# Or individual services
docker build -t polaris-api -f Dockerfile .
docker run -p 8000:8000 --env-file .env polaris-api

project-polaris/
├── .env.example              # Environment template
├── .gitignore                # Git ignore patterns
├── Architecture.md           # Detailed system architecture
├── SETUP_GUIDE.md            # Comprehensive setup guide
├── docker-compose.yml        # Docker orchestration
├── Dockerfile                # API container
├── Dockerfile.streamlit      # UI container
├── requirements.txt          # Python dependencies
├── config/                   # Configuration management
│   ├── __init__.py
│   ├── settings.py           # Pydantic settings
│   └── logging_config.py     # Logging configuration
├── src/                      # Source code
│   ├── __init__.py
│   ├── core/                 # Core components
│   │   ├── __init__.py
│   │   ├── embeddings.py     # Gemini embeddings
│   │   ├── llm.py            # Gemini LLM wrapper
│   │   └── vector_store.py   # PGVector integration
│   ├── agents/               # Multi-agent system
│   │   ├── __init__.py
│   │   ├── base_agent.py     # Base agent class
│   │   ├── router_agent.py   # Query routing
│   │   ├── query_agent.py    # Information retrieval
│   │   └── summary_agent.py  # Summarization
│   ├── rag/                  # RAG pipeline
│   │   ├── __init__.py
│   │   ├── retriever.py      # Advanced retriever
│   │   ├── hyde.py           # HyDE implementation
│   │   ├── reranker.py       # Cross-encoder reranking
│   │   └── fusion.py         # Rank fusion
│   ├── chains/               # LangChain chains
│   │   ├── __init__.py
│   │   ├── qa_chain.py       # Q&A chain
│   │   └── summary_chain.py  # Summary chain
│   ├── prompts/              # Prompt templates
│   │   ├── __init__.py
│   │   ├── query_prompts.py  # Q&A prompts
│   │   ├── summary_prompts.py # Summary prompts
│   │   └── agent_prompts.py  # Agent system prompts
│   ├── tools/                # Agent tools
│   │   ├── __init__.py
│   │   └── search_tools.py   # Search utilities
│   ├── utils/                # Utility functions
│   │   ├── __init__.py
│   │   ├── cache.py          # Redis caching
│   │   ├── metrics.py        # Prometheus metrics
│   │   └── helpers.py        # Helper functions
│   └── api/                  # FastAPI application
│       ├── __init__.py
│       ├── main.py           # FastAPI app
│       ├── dependencies.py   # API dependencies
│       └── routes/           # API routes
│           ├── __init__.py
│           ├── health.py     # Health endpoints
│           ├── query.py      # Query endpoints
│           └── summary.py    # Summary endpoints
├── ui/                       # Streamlit interface
│   ├── __init__.py
│   ├── streamlit_app.py      # Main Streamlit app
│   ├── components/           # UI components
│   │   ├── __init__.py
│   │   ├── chat.py           # Chat interface
│   │   ├── sidebar.py        # Sidebar components
│   │   └── metrics.py        # Metrics display
│   └── styles/               # CSS styles
│       └── main.css          # Custom styles
├── scripts/                  # Utility scripts
│   ├── __init__.py
│   ├── test_connection.py    # Connection tests
│   ├── quick_test.py         # Quick functionality test
│   └── setup_db.py           # Database setup
├── tests/                    # Test suite
│   ├── __init__.py
│   ├── conftest.py           # Test configuration
│   ├── test_agents/          # Agent tests
│   ├── test_rag/             # RAG tests
│   ├── test_api/             # API tests
│   └── test_utils/           # Utility tests
├── logs/                     # Log files (created at runtime)
└── docs/                     # Documentation
    ├── api.md                # API documentation
    ├── deployment.md         # Deployment guide
    └── troubleshooting.md    # Troubleshooting guide
import requests
response = requests.post(
"http://localhost:8000/api/v1/query",
json={
"query": "What are the key findings in Q4 reports?",
"chat_history": [], # Optional conversation history
"filters": {}, # Optional metadata filters
"include_sources": True,
"include_followup": True
}
)
result = response.json()
print(result["answer"])
print(f"Sources: {result['num_sources']}")
print("Follow-up questions:", result["followup_questions"])

response = requests.post(
"http://localhost:8000/api/v1/summarize",
json={
"query": "Summarize all client feedback",
"summary_type": "executive", # Options: brief, comprehensive, executive
"max_docs": 10,
"filters": {} # Optional metadata filters
}
)
result = response.json()
print(result["summary"])
print("Key Points:", result["key_points"])
print("Insights:", result["insights"])

response = requests.get("http://localhost:8000/api/v1/system/info")
system_info = response.json()
print(f"Status: {system_info['status']}")
print(f"Vector Store: {system_info['vector_store']['total_documents']} documents")
print(f"Models: {system_info['models']}")

# Activate virtual environment
source venv/bin/activate
# Run all tests
pytest tests/ -v
# Run with coverage report
pytest tests/ --cov=src --cov-report=html --cov-report=term
# Run specific test categories
pytest tests/test_agents/ -v # Agent tests
pytest tests/test_rag/ -v # RAG pipeline tests
pytest tests/test_api/ -v # API endpoint tests
pytest tests/test_utils/ -v # Utility function tests
# Run tests with markers
pytest -m "not slow" -v # Skip slow tests
pytest -m "integration" -v       # Run only integration tests

Tests use the following configuration:
- Test database: Separate Supabase project or local PostgreSQL
- Mock API keys: For Gemini (set in tests/conftest.py)
- Redis: Uses fakeredis for testing
- Fixtures: Shared test data in tests/fixtures/
# Load testing with locust (install: pip install locust)
locust -f tests/performance/locustfile.py --host=http://localhost:8000
# Memory profiling
python -m memory_profiler scripts/profile_memory.py
# API response time testing
python tests/performance/test_response_times.py

Access metrics at: http://localhost:9090/metrics
Available metrics:
- Request latency
- Token usage
- Agent execution times
- Error rates
- Cache hit rates
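As a sketch of how such metrics are typically registered with the prometheus_client library (the metric names below are illustrative, not necessarily the ones Polaris exports from src/utils/metrics.py):

```python
from prometheus_client import Counter, Histogram

# Latency histogram, labeled by endpoint so dashboards can break it down
REQUEST_LATENCY = Histogram(
    "polaris_request_latency_seconds",
    "Time spent handling a request",
    ["endpoint"],
)
# Simple monotonically increasing counter for cache hits
CACHE_HITS = Counter("polaris_cache_hits_total", "Cache hits")

# The context manager observes elapsed time into the histogram
with REQUEST_LATENCY.labels(endpoint="/api/v1/query").time():
    pass  # handler work goes here

CACHE_HITS.inc()
```

Prometheus then scrapes these values from the `/metrics` endpoint on each collection interval.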
# Basic health check
curl http://localhost:8000/api/v1/health
# Response:
# {
# "status": "healthy",
# "timestamp": "2024-01-15T10:30:00Z"
# }
# System information (detailed health)
curl http://localhost:8000/api/v1/system/info
# Response includes:
# - System status and uptime
# - Vector store statistics
# - Model information
# - Performance metrics
# - Cache statistics

# In .env or settings
ENABLE_HYDE=true # Enable HyDE
ENABLE_RERANKING=true # Enable cross-encoder reranking
ENABLE_HYBRID_SEARCH=true # Enable hybrid search
TOP_K_RETRIEVAL=20 # Initial retrieval count
TOP_K_FINAL=5 # Final results
SIMILARITY_THRESHOLD=0.7    # Minimum similarity score

GEMINI_MODEL_FLASH="gemini-2.5-flash-latest"  # Fast queries
GEMINI_MODEL_PRO="gemini-2.5-pro-latest"      # Summaries
GEMINI_TEMPERATURE=0.1                        # Response randomness

railway init
railway variables set GOOGLE_API_KEY="your_key"
railway variables set DATABASE_URL="your_db_url"
railway up

gcloud run deploy polaris-api \
--source . \
--platform managed \
--region us-central1 \
--allow-unauthenticated

See docs/deployment.md for detailed AWS deployment guide.
Full interactive API documentation is available at /docs when running the server.
Currently, the API is open for development. In production, JWT authentication can be enabled:
# Enable authentication in .env
ENABLE_AUTH=true
JWT_SECRET_KEY="your-secret-key"
JWT_ALGORITHM="HS256"

API endpoints are rate-limited to prevent abuse:
- Query endpoints: 60 requests per minute
- Summary endpoints: 30 requests per minute
- Health endpoints: 120 requests per minute
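A client can stay within these limits by backing off when it receives HTTP 429. A minimal sketch of capped exponential backoff; the function name and retry policy are this example's, not part of the Polaris API:

```python
import time

def request_with_backoff(send, max_retries=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Call send() (must return an object with .status_code), retrying on 429.

    Waits 1s, 2s, 4s, ... between attempts, capped at `cap` seconds.
    """
    for attempt in range(max_retries):
        resp = send()
        if resp.status_code != 429:
            return resp
        sleep(min(cap, base * 2 ** attempt))
    return resp  # give up and return the last rate-limited response
```

With the `requests` library this could be used as `request_with_backoff(lambda: requests.post(url, json=payload))`.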
All endpoints return consistent error responses:
{
"detail": "Error description",
"error_code": "SPECIFIC_ERROR_CODE",
"timestamp": "2024-01-15T10:30:00Z"
}

Common HTTP status codes:
- 200 - Success
- 400 - Bad Request (invalid input)
- 401 - Unauthorized (if auth enabled)
- 422 - Validation Error
- 429 - Rate Limit Exceeded
- 500 - Internal Server Error
POST /api/v1/query - Query documents with advanced RAG
- Request: {"query": "string", "chat_history": [], "filters": {}, "include_sources": true, "include_followup": true}
- Response: Answer with sources and follow-up questions

POST /api/v1/summarize - Generate document summaries
- Request: {"query": "string", "summary_type": "comprehensive|brief|executive", "max_docs": 10, "filters": {}}
- Response: Summary with key points and insights

GET /api/v1/health - Basic health check
- Response: {"status": "healthy", "timestamp": "ISO-8601"}

GET /api/v1/system/info - Detailed system information
- Response: System stats, model info, and performance metrics
# Install dev dependencies
pip install -r requirements.txt
pip install black ruff mypy pytest
# Setup pre-commit hooks
pre-commit install
# Run linters
black src/
ruff check src/
mypy src/

Adding a new agent:
- Create agent in src/agents/
- Inherit from BaseAgent
- Implement execute() method
- Register in router agent
- Add tests
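A hypothetical sketch of the inherit-and-implement steps, assuming BaseAgent exposes a single abstract execute() method (the real class lives in src/agents/base_agent.py and may differ):

```python
from abc import ABC, abstractmethod

class BaseAgent(ABC):
    """Stand-in for src.agents.base_agent.BaseAgent."""
    @abstractmethod
    def execute(self, query: str) -> str: ...

class KeywordEchoAgent(BaseAgent):
    """Toy agent: inherits from BaseAgent and implements execute()."""
    def execute(self, query: str) -> str:
        return f"handled: {query}"

print(KeywordEchoAgent().execute("ping"))  # handled: ping
```

The new class would then be registered with the router agent so queries can be dispatched to it.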
Edit templates in src/prompts/:
- query_prompts.py - Q&A prompts
- summary_prompts.py - Summary prompts
- agent_prompts.py - Agent system prompts
- JWT authentication for API endpoints
- Rate limiting per client
- Input validation with Pydantic
- SQL injection prevention
- Secure password hashing
- Environment variable encryption
MIT License - see LICENSE file for details
Contributions welcome! Please:
- Fork the repository
- Create feature branch
- Add tests
- Submit pull request
- Documentation: Check /docs directory
- Issues: Create GitHub issue
- API Docs: http://localhost:8000/docs
- Built with LangChain, FastAPI, and Streamlit
- Powered by Google Gemini 2.5 models
- Vector storage with Supabase PGVector
- Integrates with n8n workflow from Part 1
Project Polaris - Advanced Document Intelligence System Made with ❤️