Your AI tools need
better context
CodeWeaver gives Claude, Cursor, and other AI agents precise code search—so they find what matters instead of dumping entire files into context.
Open source. MIT OR Apache-2.0.
Now in alpha.
CodeWeaver
A code search server that plugs into Claude, Cursor, and other AI tools via MCP (Model Context Protocol). Ask questions in plain language, get back the exact functions and classes that answer them.
What it does
- Hybrid Search Engine: Combines keyword matching (finds exact names) with meaning-based search (finds related code). Both run automatically on every search.
- Structural Code Parsing: Full syntax parsing using tree-sitter (27 languages) with language-aware chunking for 166+ total languages.
- Automatic Reranking: Second-stage model rescores all results for relevance. Works with 5 different reranking providers.
- Flexible Deployment: Choose your embedding provider: local (FastEmbed, SentenceTransformers), cloud (Voyage, AWS Bedrock, OpenAI), or bring your own. 17 providers supported.
- Compact Tool Interface: Returns precise code snippets with metadata. Single MCP tool with ~500 tokens to describe the tool (vs. 16,000+ for LSP-based tools).
✗Typical Agent Workflow
✓With CodeWeaver
What Makes CodeWeaver Different
Always Hybrid
Every search combines keyword matching (finds exact names) with meaning-based search (finds related code). You don't configure this—it just works.
Always Reranked
Results pass through a second model that scores them for relevance. Quality defaults, no tuning required.
Broad Language Support
Full syntax parsing via tree-sitter (27 languages). Language-aware chunking for 166+ total. Practically any language that follows common patterns.
Works Offline
Run fully local with no external dependencies—or use cloud embeddings for better accuracy. Your choice.
Plugs Into Your Tools
Built on MCP (Model Context Protocol)—the standard for connecting AI tools. Works with Claude Code, Cursor, Copilot, and others.
Opinionated Defaults
Everything is configurable, but nothing needs to be configured. Sane defaults that just work.
The Key Tradeoffs
CodeWeaver
Hybrid search: keyword matching + meaning-based. Ask questions in plain language ('where do we handle auth?') and find relevant code even if nothing is literally named 'auth'.
Strengths
- 166+ languages (27 with full parsing)
- One tool, ~500 tokens overhead
- No language server required
- Finds conceptually related code
Tradeoffs
- Direct symbol lookup coming soon (not yet exposed)
- Semantic match, not always exact
Serena
Language Server Protocol (LSP)—same tech as 'go to definition' in your IDE. Search for exact symbol names (`handleAuth`) and get precisely that function.
Strengths
- Instant, exact symbol lookup
- IDE-like precision
- Also does editing (9 tools)
Tradeoffs
- ~30 languages (needs LSP for each)
- 20+ tools, ~16,000 tokens overhead
- Requires language server running
The full technical rundown
For those who prefer traditional comparison tables, here are the detailed specifications across all tools.
Alpha 5: Works. Not Perfect.
Where We Are
CodeWeaver is in Alpha 5. It works. It's not perfect. It has rough edges.
This is a solo project built by someone who believes in getting details right. Every line is written with care.
Development Timeline
Hybrid search always-on, 166+ languages, 17 providers
Stability improvements, bug fixes, structural and testing improvements
Agent integration for smarter search and intent handling, data layer for including other data sources (like external API docs), documentation improvements
Production-ready, comprehensive docs, multi-repo
What to Expect
What Works
- Hybrid search (keyword + meaning) across codebases
- Automatic reranking for relevance
- 17 embedding providers, 50+ models
- 27 languages with code meaning
- 166+ languages with language-aware chunking
- Fully offline operation (quickstart profile)
- Fallback/backup search when mainline providers fail
What's Rough
- Documentation still improving
- Setup requires some technical comfort
- Performance varies by codebase size and config choices
- Direct symbol lookup not yet exposed
- Some edge cases not handled
One Piece of the Puzzle
I built CodeWeaver because AI agents have a context problem. They re-read huge files to learn what line something was on. Most tools they're given were built for humans.
CodeWeaver is one piece of the puzzle: hybrid code search that just works. Keyword matching and meaning-based search, combined automatically. Reranking built in. Quality defaults you don't have to think about.
This is Alpha 5 software built by one person. It's rough around the edges. But the fundamentals are solid, and I'd rather ship something useful than polish forever.
Let's build it!
What's Next
- Direct symbol lookup (LSP-style "go to definition")
- Way better docs
- Agent integration to assign intent and information needs, decide on search strategy
- Performance tuning and optimization
- Thread: dependency tracking and codebase intelligence