Search your codebase with natural language - "How do I validate JWT tokens?" instead of searching for exact function names.
CodePilot is a semantic code search engine that lets you search your codebase using natural language. Instead of searching for exact function names or keywords, ask questions like:
- "How do I validate JWT tokens?"
- "Where are API routes defined?"
- "How does error handling work?"
- "Show me authentication middleware"
- 🧠 Natural Language Search - Ask questions in plain English
- 🌐 GitHub Integration - Search any public GitHub repository instantly
- 🌍 Multi-Language Support - Python, TypeScript, JavaScript, Go, Java, Rust, C++, Ruby, PHP
- ⚡ Lightning Fast - 31.5ms average search latency
- 🎨 Beautiful UI - Modern web interface with syntax highlighting
- 🔄 Real-time Indexing - Index any repository in seconds
- 🎯 Advanced Filtering - Filter by language, path, and result count
- 📊 Performance Metrics - Built-in evaluation and benchmarking
- 🐳 Docker Ready - One-command deployment
- 🔌 RESTful API - Complete API for integrations
- ☁️ Cloud-Ready - Deploy to Vercel, Railway, or any cloud platform
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Web Frontend   │    │   FastAPI API   │    │  Vector Search  │
│   (Next.js)     │◄──►│    (Python)     │◄──►│    (FAISS)      │
│                 │    │                 │    │                 │
│ • Search UI     │    │ • /search       │    │ • Embeddings    │
│ • Ingestion     │    │ • /ingest       │    │ • Indexing      │
│ • Filters       │    │ • /status       │    │ • Similarity    │
│ • Syntax Highl. │    │ • /health       │    │ • Search        │
└─────────────────┘    └─────────────────┘    └─────────────────┘
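The Similarity and Search bullets in the diagram come down to nearest-neighbour lookup over embedding vectors. The sketch below illustrates that idea with brute-force cosine similarity in pure Python. It is not CodePilot's implementation and does not call FAISS; the function names and the toy three-dimensional vectors are assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=5):
    """Return (chunk_index, score) pairs for the k most similar chunks, best first."""
    scored = [
        (cosine_similarity(query_vec, vec), idx)
        for idx, vec in enumerate(chunk_vecs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Hypothetical embeddings for three code chunks and one query.
chunks = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
results = top_k([1.0, 0.0, 0.0], chunks, k=2)
```

A vector index such as FAISS serves the same purpose as `top_k` here but is built to stay fast on large collections, where comparing the query against every chunk in a Python loop would be too slow.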
Before you begin, make sure you have:
- Python 3.11 or higher - Download Python
- Node.js 18+ and npm - Download Node.js
- Git - Download Git
Check your versions:
python --version # or python3 --version
node --version
npm --version
git --version
Clone the repository:
git clone https://github.com/amomin2004/codepilot.git
cd codepilot
Backend (Python):
# Install Python dependencies
pip install -r requirements.txt
# or if using pip3:
pip3 install -r requirements.txt
Frontend (Node.js):
# Navigate to web directory and install
cd web
npm install
cd ..
You need two terminal windows running simultaneously.
Terminal 1 - Start Backend:
# From the project root directory
python3 -m uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
You should see:
INFO: Uvicorn running on http://0.0.0.0:8000
INFO: 🚀 CodePilot API ready!
Terminal 2 - Start Frontend:
# From the project root directory
cd web
npm run dev
You should see:
▲ Next.js 15.x.x
- Local: http://localhost:3000
Open your browser and go to:
- Frontend: http://localhost:3000
- API Docs: http://localhost:8000/docs (Interactive API documentation)
If you prefer Docker:
# Clone repository
git clone https://github.com/amomin2004/codepilot.git
cd codepilot
# Start with Docker Compose
docker-compose up -d
# Access the application
open http://localhost:3000
CodePilot can index any GitHub repository or local project on your computer. You need to index a repository before you can search it.
1. Open the application: http://localhost:3000
2. Click "Ingest" in the top navigation
3. Enter a repository:
   For a GitHub Repository:
   https://github.com/tiangolo/fastapi
   For a Local Project:
   /Users/yourname/projects/myapp
   Or on Windows:
   C:\Users\yourname\projects\myapp
4. Click "Start Indexing"
5. Wait for completion (usually 30-60 seconds depending on repo size)
6. You'll see a success message with statistics
Index a GitHub Repository:
curl -X POST http://localhost:8000/ingest \
-H "Content-Type: application/json" \
-d '{"repo_path": "https://github.com/tiangolo/fastapi"}'
Index a Local Project:
# macOS/Linux
curl -X POST http://localhost:8000/ingest \
-H "Content-Type: application/json" \
-d '{"repo_path": "/Users/yourname/projects/myapp"}'
# Windows (PowerShell)
curl.exe -X POST http://localhost:8000/ingest `
-H "Content-Type: application/json" `
-d '{\"repo_path\": \"C:\\Users\\yourname\\projects\\myapp\"}'
Perfect for testing with authentication/JWT questions:
| Repository | URL | Best For |
|---|---|---|
| FastAPI | https://github.com/tiangolo/fastapi | JWT, OAuth2, API security |
| Django REST | https://github.com/encode/django-rest-framework | Authentication, permissions |
| NestJS | https://github.com/nestjs/nest | TypeScript, guards, JWT strategies |
| Express | https://github.com/expressjs/express | Middleware, routing patterns |
| Next.js | https://github.com/vercel/next.js | React, API routes, auth |
Pro Tip: Start with FastAPI - it has excellent security examples and is perfect for JWT-related queries!
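If you would rather script indexing than use curl, the same POST to /ingest can be built with Python's standard library. This is a minimal sketch that mirrors the curl examples: the endpoint, the `repo_path` field, and the default port come from this README, while the helper name `build_ingest_request` is hypothetical. The response format is not documented here, so the commented-out send step just prints the raw body.

```python
import json
import urllib.request


def build_ingest_request(repo_path, base_url="http://localhost:8000"):
    """Build (but do not send) a POST /ingest request for a GitHub URL or local path."""
    payload = json.dumps({"repo_path": repo_path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/ingest",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_ingest_request("https://github.com/tiangolo/fastapi")

# Sending requires the backend from "Terminal 1" to be running:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Because `json.dumps` handles escaping, Windows paths such as `C:\Users\yourname\projects\myapp` can be passed as ordinary Python strings without the manual backslash escaping the PowerShell example needs.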
Once you've indexed a repository, you can search it using natural language!
- Go to the Search page: http://localhost:3000 (home page)
- Type your question in natural language:
  How do I validate JWT tokens?
- Press Enter or click "Search"
- View results with syntax highlighting and file locations
Refine your search with filters:
- Language Filter:
  - Select Python, TypeScript, JavaScript, etc.
  - Only shows results from that language
- Path Filter:
  - Enter security/ to search only in the security directory
  - Enter auth to find files with "auth" in the path
- Result Count:
  - Choose 5, 10, or 20 results per search
Example with Filters:
Query: "JWT authentication"
Language: Python
Path Contains: security
Results: 10
Simple Search:
curl "http://localhost:8000/search?q=How%20do%20I%20validate%20JWT%20tokens&k=5"
Search with Filters:
# Filter by language (Python only)
curl "http://localhost:8000/search?q=authentication&lang=python&k=10"
# Filter by path (security directory only)
curl "http://localhost:8000/search?q=JWT&pathContains=security&k=5"
# Combine filters
curl "http://localhost:8000/search?q=OAuth&lang=python&pathContains=auth&k=10"
Try these questions on an indexed repository:
Authentication & Security:
- "How do I validate JWT tokens?"
- "Where is OAuth2 authentication implemented?"
- "How do I secure API endpoints?"
- "Show me password hashing examples"
- "Where is authentication middleware defined?"
API Development:
- "How do I create REST endpoints?"
- "Where are API routes defined?"
- "How do I handle request validation?"
- "Show me error response examples"
- "How do I add CORS headers?"
Database:
- "How do I connect to a database?"
- "Where are database models defined?"
- "How do I handle database sessions?"
- "Show me migration examples"
Testing:
- "How do I write unit tests?"
- "Where are test fixtures defined?"
- "How do I mock dependencies?"
Here's a complete example from start to finish:
1. Start CodePilot (if not already running):
# Terminal 1
python3 -m uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
# Terminal 2
cd web && npm run dev
2. Index FastAPI Repository:
Open http://localhost:3000/ingest and enter:
https://github.com/tiangolo/fastapi
Click "Start Indexing" and wait ~60 seconds.
3. Search for JWT Information:
Go to http://localhost:3000 and try:
- "How do I validate JWT tokens?"
- "Where is OAuth2 implemented?"
- "Show me bearer token authentication"
4. View Results:
You'll see code snippets with:
- ✅ File paths (e.g., docs_src/security/tutorial005.py)
- ✅ Line numbers
- ✅ Syntax highlighting
- ✅ Relevance scores
- ✅ Direct code previews
5. Refine with Filters:
Add filters:
- Language: Python
- Path: security
- Results: 10
Based on comprehensive evaluation with 25 test queries:
| Metric | Result | Target | Status |
|---|---|---|---|
| Precision@5 | 52% | 80% | Below target |
| Precision@10 | 53% | 90% | Below target |
| Mean Reciprocal Rank | 0.357 | 0.700 | Below target |
| Latency P50 | 14.8ms | ≤200ms | ✅ Excellent |
| Latency P95 | 53.3ms | ≤500ms | ✅ Excellent |
| Latency P99 | 282.3ms | ≤1000ms | ✅ Excellent |
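The P50, P95, and P99 rows are percentiles of per-query search latency. As a rough illustration of how such figures can be derived from raw timings, here is a sketch using the nearest-rank method. The sample timings are invented, and the evaluation scripts may well use a different percentile definition (for example, interpolation), so treat this as an assumption rather than a description of `evaluation/eval.py`.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds for ten queries; one slow outlier.
latencies_ms = [12.0, 15.0, 14.0, 200.0, 13.0, 16.0, 18.0, 11.0, 17.0, 19.0]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
```

With only a handful of queries, a single slow request dominates the upper percentiles, which is one plausible reason P99 in the table sits far above P50.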
- Routing: 100% precision - Perfect API endpoint queries
- Error Handling: 100% precision - Excellent exception handling
- Authentication: 67% precision - Good security-related queries
- Middleware: 50% precision - Moderate middleware queries
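For readers unfamiliar with the retrieval metrics above, the sketch below shows one common way to compute Precision@k and Mean Reciprocal Rank. The README does not state the exact definitions the evaluation framework uses (for instance, whether Precision@k divides by k or by the number of results returned), so the functions and the file names in the example are assumptions for illustration.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that are relevant (divides by k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked_ids[:k] if item in relevant_ids)
    return hits / k


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / position of the first relevant result, or 0.0 if none is relevant."""
    for position, item in enumerate(ranked_ids, start=1):
        if item in relevant_ids:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked_ids, relevant_ids) pairs."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


# Two hypothetical queries with made-up result lists and relevance labels.
runs = [
    (["a.py", "b.py", "c.py", "d.py", "e.py"], {"b.py", "e.py"}),
    (["x.py", "y.py", "z.py"], {"x.py"}),
]
p_at_5 = precision_at_k(runs[0][0], runs[0][1], k=5)
mrr = mean_reciprocal_rank(runs)
```

Under these definitions, an MRR of 0.357 would mean the first relevant result appears, on average, around the third position.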
# Production with Nginx
docker-compose -f docker-compose.prod.yml up -d
# Access via Nginx proxy
open http://localhost
# Development with hot reload
docker-compose -f docker-compose.dev.yml up -d
# Build containers
./scripts/docker-build.sh
# Start services
./scripts/docker-start.sh
# Stop services
./scripts/docker-stop.sh
# Run evaluation with 25 test queries
python evaluation/cli_eval.py
# Generate HTML report
python evaluation/cli_eval.py --report
# Verbose output
python evaluation/cli_eval.py --verbose
# Test all components
python evaluation/test_eval.py
# Use custom golden set
python evaluation/cli_eval.py --golden-set my_queries.json
# Save results
python evaluation/cli_eval.py --output my_results.json
| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Health check |
| /status | GET | System status and indexing info |
| /ingest | POST | Index a repository |
| /search | GET | Semantic search with filters |
GET /search?q=query&k=5&lang=python&pathContains=auth
- q - Search query (required)
- k - Number of results (default: 5)
- lang - Language filter (python, typescript, etc.)
- pathContains - Path filter (auth, middleware, etc.)
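When calling /search from code, it is easy to get the URL encoding of natural-language queries wrong. The sketch below assembles a search URL from the four documented parameters using Python's standard library. The parameter names and defaults come from this README; the helper name `build_search_url` and its keyword arguments are hypothetical.

```python
from urllib.parse import quote, urlencode


def build_search_url(query, k=5, lang=None, path_contains=None,
                     base_url="http://localhost:8000"):
    """Build a GET /search URL; optional filters are omitted when not set."""
    params = {"q": query, "k": k}
    if lang:
        params["lang"] = lang
    if path_contains:
        params["pathContains"] = path_contains
    # quote_via=quote encodes spaces as %20, matching the curl examples here.
    return f"{base_url}/search?{urlencode(params, quote_via=quote)}"


simple_url = build_search_url("JWT token validation")
filtered_url = build_search_url(
    "authentication", k=10, lang="python", path_contains="auth"
)
```

The resulting strings can be passed to curl or to `urllib.request.urlopen` once the backend is running.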
# Search for JWT validation
curl "http://localhost:8000/search?q=JWT%20token%20validation&k=5"
# Search with filters
curl "http://localhost:8000/search?q=authentication&lang=python&pathContains=auth"
# Index repository
curl -X POST http://localhost:8000/ingest \
-H "Content-Type: application/json" \
-d '{"repo_path": "/path/to/project"}'
codepilot/
├── 🐍 api/ # FastAPI backend
│ ├── main.py # Main application (311 lines)
│ ├── ingest.py # Repository ingestion (280 lines)
│ ├── embeddings.py # Vector embeddings (150 lines)
│ ├── vector_index.py # FAISS operations (120 lines)
│ ├── search.py # Search logic (280 lines)
│ └── cli_ingest.py # CLI ingestion tool
├── 🌐 web/ # Next.js frontend
│ ├── src/app/ # App router pages
│ │ ├── page.tsx # Search page (342 lines)
│ │ ├── ingest/page.tsx # Ingestion page (314 lines)
│ │ └── layout.tsx # Root layout
│ ├── src/components/ # React components
│ │ └── Navigation.tsx # Top navigation (55 lines)
│ └── src/lib/ # Utilities
│ └── api.ts # API client (85 lines)
├── 📊 evaluation/ # Evaluation framework
│ ├── goldens.json # Test queries (25 queries)
│ ├── eval.py # Evaluation engine (400+ lines)
│ ├── cli_eval.py # CLI tool (300+ lines)
│ └── test_eval.py # Test suite (250+ lines)
├── 🐳 Docker files
│ ├── Dockerfile.api # API container
│ ├── Dockerfile.web # Web container
│ ├── docker-compose.yml # Development
│ ├── docker-compose.prod.yml # Production
│ └── nginx.conf # Reverse proxy
├── 📦 scripts/ # Deployment scripts
│ ├── docker-build.sh # Build containers
│ ├── docker-start.sh # Start services
│ └── docker-stop.sh # Stop services
├── 📚 data/ # Sample repositories
└── 📄 output/ # Index and chunks storage
Total: ~4,000 lines of production code
Problem: ModuleNotFoundError or import errors
ModuleNotFoundError: No module named 'api.ingest'
Solution: Make sure you're running from the project root and using the correct command:
cd /path/to/codepilot
python3 -m uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
Problem: Address already in use (Port 8000)
Solution: Kill the process using port 8000:
# macOS/Linux
lsof -ti:8000 | xargs kill -9
# Windows
netstat -ano | findstr :8000
taskkill /PID <PID> /F
Problem: npm install fails or modules not found
Solution:
cd web
rm -rf node_modules package-lock.json
npm install
Problem: Port 3000 already in use
Solution: Use a different port:
cd web
PORT=3001 npm run dev
Problem: "No chunks created" when indexing
Solutions:
- ✅ Make sure the path exists and is correct
- ✅ For local paths, use absolute paths: /Users/name/project not ~/project
- ✅ For GitHub URLs, use full HTTPS URLs: https://github.com/user/repo
- ✅ Check that the repository has supported file types (.py, .ts, .js, etc.)
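The checks in this list can be run before calling /ingest. The following sketch normalizes a `repo_path` on the client side: it expands `~`, rejects relative paths, and requires GitHub URLs to be full HTTPS URLs. The helper names are hypothetical, the extension set contains only the three extensions named above (the real list is longer), and the backend may apply different or additional validation.

```python
import os
from urllib.parse import urlparse

# Only the extensions this README names explicitly; the full supported set is larger.
EXAMPLE_EXTENSIONS = {".py", ".ts", ".js"}


def normalize_repo_path(repo_path):
    """Return a GitHub HTTPS URL unchanged, or an absolute local path with ~ expanded."""
    parsed = urlparse(repo_path)
    if parsed.scheme in ("http", "https"):
        parts = [part for part in parsed.path.split("/") if part]
        if parsed.scheme != "https" or parsed.netloc != "github.com" or len(parts) < 2:
            raise ValueError("expected a URL like https://github.com/user/repo")
        return repo_path
    # Anything else is treated as a local path (absoluteness is platform-dependent).
    expanded = os.path.expanduser(repo_path)
    if not os.path.isabs(expanded):
        raise ValueError("local paths must be absolute")
    return expanded


def is_example_supported_file(filename):
    """True if the file extension is one of the extensions named in this README."""
    return os.path.splitext(filename)[1].lower() in EXAMPLE_EXTENSIONS


github_path = normalize_repo_path("https://github.com/tiangolo/fastapi")
```

Expanding `~` on the client avoids the `~/project` pitfall mentioned above, since the expanded absolute path is what gets sent in the request.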
Problem: GitHub repository clone fails
Solution: Make sure the repository is public or check your internet connection:
# Test if you can clone manually
git clone https://github.com/tiangolo/fastapi /tmp/test-repo
Problem: Search returns no results
Solutions:
- Make sure you've indexed a repository first
- Check indexing status: http://localhost:8000/status
- Try a simpler query: "authentication" instead of "How do I implement OAuth2?"
- Remove filters and try again
Problem: Frontend can't connect to backend
Solution:
- Verify backend is running: curl http://localhost:8000/health
- Check web/.env.local exists with: NEXT_PUBLIC_API_URL=http://localhost:8000
- Restart the frontend after creating .env.local
If you encounter other issues:
- Check the backend logs in Terminal 1
- Check the frontend logs in Terminal 2
- Visit the API docs: http://localhost:8000/docs
- Check system status: http://localhost:8000/status
# Backend development
python3 -m uvicorn api.main:app --reload --host 0.0.0.0 --port 8000
# Frontend development
cd web && npm run dev
# Run evaluation
python evaluation/cli_eval.py --verbose
- Backend: Add to api/ directory
- Frontend: Add to web/src/ directory
- Tests: Add to evaluation/ directory
- Docker: Update Dockerfiles as needed
# Python formatting
black api/
isort api/
# TypeScript checking
cd web && npm run type-check
# Run all tests
python evaluation/test_eval.py
- New Team Members: Understand large codebases quickly
- Code Review: Find relevant code before reviewing
- Debugging: Locate error handling and similar patterns
- Refactoring: Understand dependencies and relationships
- Knowledge Discovery: Find existing solutions in codebase
- Code Reuse: Locate reusable components and patterns
- Documentation: Understand complex flows and architectures
- Onboarding: Accelerate new developer productivity
- Contributors: Understand unfamiliar codebases
- Maintainers: Help new contributors find relevant code
- Users: Learn how to use complex libraries
- Researchers: Analyze code patterns and practices
- Repository ingestion and chunking
- Vector embeddings and indexing
- Semantic search API
- Web interface
- Evaluation framework
- Hybrid search (keyword + semantic)
- Code completion integration
- Multi-repository search
- Advanced filtering options
- IDE extensions (VSCode, IntelliJ)
- CLI tool for terminal usage
- GitHub/GitLab integration
- Code generation from queries
- User authentication and authorization
- Team collaboration features
- Advanced analytics and insights
- Custom model training
- FastAPI - Excellent Python web framework
- Next.js - React framework for production
- Sentence Transformers - State-of-the-art embeddings
- FAISS - Efficient similarity search
- Tailwind CSS - Utility-first CSS framework
- Lucide React - Beautiful icon library
Built by Ali Asgar Momin