Semantic code search and relationship tracking via MCP and Unix CLI.
- How It Works
- Installation
- Quick Start
- Claude Integration
- Configuration
- Documentation Comments for Better Search
- CLI Commands
- MCP Tools
- Performance
- Architecture Highlights
- Requirements
- Current Limitations
- Roadmap
- Feature Details
- Contributing
- License
- Parse - Tree-sitter AST parsing for Rust, Python, and PHP (more languages coming)
- Extract - Symbols, call graphs, implementations, and type relationships
- Embed - 384-dimensional vectors from doc comments via AllMiniLML6V2
- Index - Tantivy for full-text search + memory-mapped symbol cache for <10ms lookups
- Serve - MCP protocol for AI assistants, ~300ms response time
# Install latest version
cargo install codanna
# Install with HTTP server (OAuth authentication)
cargo install codanna --features http-server
# Install with HTTPS server (TLS + optional OAuth)
cargo install codanna --features https-server
# Install from local path (development)
cargo install --path . --all-features
- Initialize:
# Initialize codanna index space and create .codanna/settings.toml
codanna init
- Index your codebase:
# Index with progress display
codanna index src --progress
# See what would be indexed (dry run)
codanna index . --dry-run
# Index a specific file
codanna index src/main.rs
- Search your code:
# Semantic search with new simplified syntax
codanna mcp semantic_search_docs query:"parse rust files" limit:3 --json
# Find symbols with JSON output
codanna retrieve symbol Parser --json
# Analyze function relationships
codanna mcp find_callers process_file --json | jq '.data[].name'
# Legacy format still works
codanna mcp semantic_search_with_context --args '{"query": "parse rust files and extract symbols", "limit": 3}'
Add to your .mcp.json:
{
"mcpServers": {
"codanna": {
"command": "codanna",
"args": ["serve", "--watch", "--watch-interval", "5"]
}
}
}
For a persistent server with real-time file watching:
# HTTP server with OAuth authentication (requires http-server feature)
codanna serve --http --watch
# HTTPS server with TLS encryption (requires https-server feature)
codanna serve --https --watch
Configure in .mcp.json:
{
"mcpServers": {
"codanna-sse": {
"type": "sse",
"url": "http://127.0.0.1:8080/mcp/sse"
}
}
}
For HTTPS configuration, see the HTTPS Server Mode documentation.
We include a codanna-navigator sub agent at .claude/agents/codanna-navigator.md. This agent is optimized for using the codanna MCP server.
The Codanna CLI is Unix-friendly, with positional arguments and JSON output for easy command chaining:
# New simplified syntax - positional arguments for simple tools
codanna mcp find_symbol main --json
codanna mcp get_calls process_file
codanna mcp find_callers init
# Key:value pairs for complex tools
codanna mcp semantic_search_docs query:"error handling" limit:3 --json
codanna mcp search_symbols query:parse kind:function --json
# Unix piping with JSON output
time codanna mcp search_symbols query:parse limit:1 --json | \
jq -r '.data[0].name' | \
xargs -I {} codanna retrieve callers {} --json | \
jq -r '.data[] | "\(.name) in \(.module_path)"'
# Result:
#
# main in crate::main
# serve_http in crate::mcp::http_server::serve_http
# serve_http in crate::mcp::http_server::serve_http
# serve_https in crate::mcp::https_server::serve_https
# serve_https in crate::mcp::https_server::serve_https
# parse in crate::parsing::rust::parse
# parse in crate::parsing::rust::parse
# parse in crate::parsing::python::parse
#
# codanna mcp search_symbols query:parse limit:1 --json 0.10s user 0.08s system 122% cpu 0.143 total
# jq -r '.data[0].name' 0.00s user 0.00s system 3% cpu 0.142 total
# xargs -I {} codanna retrieve callers {} --json 0.11s user 0.07s system 63% cpu 0.288 total
# jq -r '.data[] | "\(.name) in \(.module_path)"' 0.00s user 0.00s system 1% cpu 0.288 total
# Legacy format still supported for backward compatibility
codanna mcp find_symbol --args '{"name": "main"}'
All MCP tools support the --json flag for structured output, making integration with other tools seamless.
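The split between positional arguments and key:value pairs can be illustrated with a small sketch. This is not Codanna's actual argument parser; `ToolArgs` and `parse_tool_args` are hypothetical names, and the rule used here (a token containing `:` is a named pair, anything else is positional) is an assumption made for illustration.

```rust
use std::collections::HashMap;

/// Arguments split into positional values and key:value pairs.
/// Hypothetical type, not part of Codanna's API.
#[derive(Debug, Default, PartialEq)]
struct ToolArgs {
    positional: Vec<String>,
    named: HashMap<String, String>,
}

/// Assumed rule: a token with a non-empty key before the first ':' is a
/// named pair; every other token is positional. Quoting is expected to
/// have been resolved by the shell before the tokens arrive here.
fn parse_tool_args<I, S>(tokens: I) -> ToolArgs
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut args = ToolArgs::default();
    for token in tokens {
        let token = token.as_ref();
        match token.split_once(':') {
            Some((key, value)) if !key.is_empty() => {
                args.named.insert(key.to_string(), value.to_string());
            }
            _ => args.positional.push(token.to_string()),
        }
    }
    args
}

fn main() {
    // Mirrors: codanna mcp semantic_search_docs query:"error handling" limit:3
    let complex = parse_tool_args(vec!["query:error handling", "limit:3"]);
    println!("{:?}", complex.named.get("query"));

    // Mirrors: codanna mcp find_symbol main
    let simple = parse_tool_args(vec!["main"]);
    println!("{:?}", simple.positional);
}
```

A rule this simple would misread a positional value that itself contains a colon (for example a module path such as `crate::main`), so a real parser needs a stricter definition of what counts as a key.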
Configure Codanna in .codanna/settings.toml:
[semantic_search]
enabled = true
model = "AllMiniLML6V2"
threshold = 0.6 # Similarity threshold (0-1)
[indexing]
parallel_threads = 16 # Auto-detected by default
include_tests = true # Index test files
Codanna respects .gitignore and adds its own .codannaignore:
# Created automatically by codanna init
.codanna/ # Don't index own data
target/ # Skip build artifacts
node_modules/ # Skip dependencies
*_test.rs # Optionally skip tests
Semantic search works by understanding your documentation comments:
/// Parse configuration from a TOML file and validate required fields
/// This handles missing files gracefully and provides helpful error messages
fn load_config(path: &Path) -> Result<Config, Error> {
// implementation...
}
With good comments, semantic search can find this function when prompted for:
- "configuration validation"
- "handle missing config files"
- "TOML parsing with error handling"
This encourages better documentation → better AI understanding → more motivation to document.
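To make the role of the similarity threshold concrete, here is a minimal sketch of ranking embedded doc comments against a query. It assumes cosine similarity as the scoring function, which this README does not confirm, and it uses tiny 3-dimensional vectors instead of real 384-dimensional embeddings. The names `cosine_similarity` and `rank` are hypothetical.

```rust
use std::cmp::Ordering;

/// Cosine similarity between two vectors of the same dimension.
/// Returns 0.0 when either vector has zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Keep only candidates scoring at or above `threshold`, best match first.
fn rank<'a>(
    query: &[f32],
    docs: &[(&'a str, Vec<f32>)],
    threshold: f32,
) -> Vec<(&'a str, f32)> {
    let mut hits: Vec<(&'a str, f32)> = docs
        .iter()
        .map(|(name, vector)| (*name, cosine_similarity(query, vector)))
        .filter(|(_, score)| *score >= threshold)
        .collect();
    hits.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(Ordering::Equal));
    hits
}

fn main() {
    // Made-up vectors standing in for embedded doc comments.
    let docs = vec![
        ("load_config", vec![0.9, 0.1, 0.0]),
        ("render_page", vec![0.0, 0.2, 0.9]),
    ];
    let query = [1.0, 0.0, 0.0];
    // 0.6 matches the threshold shown in the settings.toml example.
    for (name, score) in rank(&query, &docs, 0.6) {
        println!("{} {:.2}", name, score);
    }
}
```

Raising the threshold trades recall for precision: fewer functions are returned, but the ones that remain are closer to the query.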
| Command | Description | Example |
|---|---|---|
| codanna init | Set up .codanna directory with default configuration | codanna init --force |
| codanna index <PATH> | Build searchable index from your codebase | codanna index src --progress |
| codanna config | Display active settings | codanna config |
| codanna serve | Start MCP server for AI assistants | codanna serve --watch |
All retrieve commands support --json flag for structured output (exit code 3 when not found).
| Command | Description | Example |
|---|---|---|
| retrieve symbol <NAME> | Find a symbol by name | codanna retrieve symbol main --json |
| retrieve calls <FUNCTION> | Show what functions a given function calls | codanna retrieve calls parse_file --json |
| retrieve callers <FUNCTION> | Show what functions call a given function | codanna retrieve callers main --json |
| retrieve implementations <TRAIT> | Show what types implement a trait | codanna retrieve implementations Parser --json |
| retrieve impact <SYMBOL> | Show the impact radius of changing a symbol | codanna retrieve impact main --depth 3 --json |
| retrieve search <QUERY> | Search for symbols using full-text search | codanna retrieve search "parse" --limit 5 --json |
| retrieve describe <SYMBOL> | Show comprehensive information about a symbol | codanna retrieve describe SimpleIndexer --json |
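Because the retrieve commands exit with code 3 when nothing is found, a calling program can tell "no match" apart from a real failure. The sketch below shows one way to do that from Rust. `Lookup` and `classify` are hypothetical names, and `main` assumes that `codanna` is on the PATH and that an index already exists.

```rust
use std::process::Command;

/// Outcome of a lookup, derived from the process exit code.
#[derive(Debug, PartialEq)]
enum Lookup {
    Found,
    NotFound,
    Failed(Option<i32>),
}

/// Map an exit code onto the documented meanings:
/// 0 = success, 3 = not found, anything else is treated as an error.
/// `None` means the process was terminated without an exit code.
fn classify(code: Option<i32>) -> Lookup {
    match code {
        Some(0) => Lookup::Found,
        Some(3) => Lookup::NotFound,
        other => Lookup::Failed(other),
    }
}

fn main() {
    // Assumption: `codanna` is installed and the project is indexed.
    let result = Command::new("codanna")
        .args(&["retrieve", "symbol", "main", "--json"])
        .output();

    match result {
        Ok(out) => match classify(out.status.code()) {
            Lookup::Found => println!("{}", String::from_utf8_lossy(&out.stdout)),
            Lookup::NotFound => eprintln!("symbol not found"),
            Lookup::Failed(code) => eprintln!("codanna failed: {:?}", code),
        },
        Err(err) => eprintln!("could not run codanna: {}", err),
    }
}
```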
| Command | Description | Example |
|---|---|---|
| codanna mcp-test | Verify Claude can connect and list available tools | codanna mcp-test |
| codanna mcp <TOOL> | Execute MCP tools without spawning server | codanna mcp find_symbol main --json |
| codanna benchmark | Benchmark parser performance | codanna benchmark rust --file my_code.rs |
- --config, -c: Path to custom settings.toml file
- --force, -f: Force operation (overwrite, re-index, etc.)
- --progress, -p: Show progress during operations
- --threads, -t: Number of threads to use
- --dry-run: Show what would happen without executing
Available tools when using the MCP server. All tools support --json flag for structured output.
| Tool | Description | Example |
|---|---|---|
| find_symbol | Find a symbol by exact name | codanna mcp find_symbol main --json |
| get_calls | Show functions called by a given function | codanna mcp get_calls process_file |
| find_callers | Show functions that call a given function | codanna mcp find_callers init |
| analyze_impact | Analyze the impact radius of symbol changes | codanna mcp analyze_impact Parser --json |
| get_index_info | Get index statistics and metadata | codanna mcp get_index_info --json |
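The idea of an impact radius bounded by a depth can be sketched as a breadth-first walk over a reverse call graph. This is an illustration of the concept only: how Codanna computes impact internally is not described here, and `impact_radius` and the in-memory graph shape are assumptions.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Walk a reverse call graph (callee -> direct callers) breadth-first and
/// return every symbol reachable within `max_depth` hops of `start`,
/// paired with the hop count at which it was first reached.
fn impact_radius<'a>(
    callers: &'a HashMap<&'a str, Vec<&'a str>>,
    start: &'a str,
    max_depth: usize,
) -> Vec<(&'a str, usize)> {
    let mut seen: HashSet<&'a str> = HashSet::new();
    let mut queue: VecDeque<(&'a str, usize)> = VecDeque::new();
    let mut affected = Vec::new();

    seen.insert(start);
    queue.push_back((start, 0));

    while let Some((symbol, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // do not expand beyond the requested radius
        }
        if let Some(direct) = callers.get(symbol) {
            for &caller in direct {
                // `insert` returns false for symbols already visited,
                // which also keeps recursive call cycles from looping.
                if seen.insert(caller) {
                    affected.push((caller, depth + 1));
                    queue.push_back((caller, depth + 1));
                }
            }
        }
    }
    affected
}

fn main() {
    // Hypothetical call graph: main -> index_file -> parse, benchmark -> parse.
    let mut callers: HashMap<&str, Vec<&str>> = HashMap::new();
    callers.insert("parse", vec!["index_file", "benchmark"]);
    callers.insert("index_file", vec!["main"]);

    for (symbol, depth) in impact_radius(&callers, "parse", 2) {
        println!("{} (depth {})", symbol, depth);
    }
}
```

A larger depth widens the set of symbols reported, which is why a depth limit is useful on heavily shared functions.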
| Tool | Description | Example |
|---|---|---|
| search_symbols | Search symbols with full-text fuzzy matching | codanna mcp search_symbols query:parse kind:function limit:10 |
| semantic_search_docs | Search using natural language queries | codanna mcp semantic_search_docs query:"error handling" limit:5 |
| semantic_search_with_context | Search with enhanced context | codanna mcp semantic_search_with_context query:"parse files" threshold:0.7 |
| Tool | Parameters |
|---|---|
| find_symbol | name (required) |
| search_symbols | query, limit, kind, module |
| semantic_search_docs | query, limit, threshold |
| semantic_search_with_context | query, limit, threshold |
| get_calls | function_name |
| find_callers | function_name |
| analyze_impact | symbol_name, max_depth |
| get_index_info | None |
Parser benchmarks on a 750-symbol test file:
| Language | Parsing Speed | vs. Target (10k/s) | Status |
|---|---|---|---|
| Rust | 91,318 symbols/sec | 9.1x faster ✓ | Production |
| Python | 75,047 symbols/sec | 7.5x faster ✓ | Production |
| PHP | 68,432 symbols/sec | 6.8x faster ✓ | Production |
| JavaScript | - | - | v0.4.1 |
| TypeScript | - | - | v0.4.1 |
Key achievements:
- Zero-cost abstractions: All parsers use borrowed string slices with no allocations in hot paths
- Parallel processing: Multi-threaded indexing that scales with CPU cores
- Memory efficiency: Approximately 100 bytes per symbol including all metadata
- Real-time capability: Fast enough for incremental parsing during editing
- Optimized CLI startup: ~300ms for all operations (53x improvement from v0.2)
- JSON output: Negligible overhead - structured output adds <1ms to response time
Run performance benchmarks:
codanna benchmark all # Test all parsers
codanna benchmark python # Test specific language
Memory-mapped storage: Two caches for different access patterns:
- symbol_cache.bin - FNV-1a hashed symbol lookups, <10ms response time
- segment_0.vec - 384-dimensional vectors, <1μs access after OS page cache warm-up
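For readers unfamiliar with FNV-1a, the hash itself is only a few lines. The sketch below uses the standard 64-bit offset basis and prime; whether Codanna uses the 32-bit or 64-bit variant is not stated here, so the width is an assumption, and `bucket_for` is a hypothetical helper showing how a hash could select a slot in a fixed-size table.

```rust
/// Standard 64-bit FNV offset basis and prime.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// FNV-1a: XOR each byte into the hash, then multiply by the prime.
/// (Plain FNV-1 performs the same two steps in the opposite order.)
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hypothetical helper: pick a slot for a symbol name in a table
/// with `bucket_count` slots (must be non-zero).
fn bucket_for(name: &str, bucket_count: u64) -> u64 {
    fnv1a_64(name.as_bytes()) % bucket_count
}

fn main() {
    println!("{:016x}", fnv1a_64(b"parse_file"));
    println!("{}", bucket_for("parse_file", 1024));
}
```

The hash needs no allocation and no lookup tables, which makes it a natural fit for indexing into a memory-mapped file.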
Embedding lifecycle management: Old embeddings deleted when files are re-indexed to prevent accumulation.
Lock-free concurrency: DashMap for concurrent symbol reads, write coordination via single writer lock.
Single-pass indexing: Symbols, relationships, and embeddings extracted in one AST traversal.
Hot reload: File watcher with 500ms debounce triggers re-indexing of changed files only.
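Debouncing means a burst of saves to the same file results in one re-index rather than many. The following is a minimal sketch of that idea, not Codanna's watcher: `Debouncer`, `record`, and `drain_ready` are hypothetical names, and timestamps are passed in explicitly so the logic can be exercised without real file events.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Tracks the most recent change time per path.
struct Debouncer {
    window: Duration,
    pending: HashMap<PathBuf, Instant>,
}

impl Debouncer {
    fn new(window: Duration) -> Self {
        Debouncer { window, pending: HashMap::new() }
    }

    /// Record a change event. A later event for the same path
    /// overwrites the earlier timestamp and restarts its quiet period.
    fn record(&mut self, path: &Path, now: Instant) {
        self.pending.insert(path.to_path_buf(), now);
    }

    /// Remove and return (sorted) the paths that have been quiet
    /// for at least `window`; these are ready to be re-indexed.
    fn drain_ready(&mut self, now: Instant) -> Vec<PathBuf> {
        let window = self.window;
        let mut ready: Vec<PathBuf> = self
            .pending
            .iter()
            .filter(|(_, last)| now.saturating_duration_since(**last) >= window)
            .map(|(path, _)| path.clone())
            .collect();
        ready.sort();
        for path in &ready {
            self.pending.remove(path);
        }
        ready
    }
}

fn main() {
    let mut debouncer = Debouncer::new(Duration::from_millis(500));
    let start = Instant::now();

    // Two rapid saves of the same file, 200ms apart.
    debouncer.record(Path::new("src/main.rs"), start);
    debouncer.record(Path::new("src/main.rs"), start + Duration::from_millis(200));

    // 600ms after the first save is only 400ms after the second one.
    println!("{:?}", debouncer.drain_ready(start + Duration::from_millis(600)));
    // 700ms after the first save the file has been quiet for 500ms.
    println!("{:?}", debouncer.drain_ready(start + Duration::from_millis(700)));
}
```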
- Rust 1.75+ (for development)
- ~150MB for model storage (downloaded on first use)
- A few MB for index storage (varies by codebase size)
- Supports Rust, Python, and PHP (JavaScript/TypeScript coming in v0.4.1)
- Semantic search requires English documentation/comments
- Windows support is experimental
- 0.3.x - CLI improvements and API stability
- 0.4.x - Language expansion via modular architecture
- 0.5.x - Enterprise features and advanced analysis
| Feature | Description | Status |
|---|---|---|
| JSON Output Support | Structured output for all commands | ✅ |
| Exit Codes | Semantic exit codes for scripting | ✅ |
| Unix-Friendly CLI | Positional args and key:value syntax | ✅ |
| Incremental Index Updates | File watching with auto re-indexing | ✅ |
| Feature | Description | Status |
|---|---|---|
| Language Registry Architecture | Modular parser system for easy language additions | ✅ |
| PHP Support | Full PHP parser implementation | ✅ |
| Python Enhancements | Complete Python class and decorator support | 🚧 |
| Feature | Description | Status |
|---|---|---|
| JavaScript Support | Full JavaScript/ES6+ parser | 📋 |
| TypeScript Support | TypeScript with type annotations | 📋 |
| Feature | Description | Status |
|---|---|---|
| Go Support | Go language with interfaces and goroutines | 📋 |
| Feature | Description | Status |
|---|---|---|
| C# Support | C# with .NET ecosystem support | 📋 |
| Feature | Description | Status |
|---|---|---|
| Java Support | Java with class hierarchies | 📋 |
| Feature | Description | Status |
|---|---|---|
| C/C++ Support | C and C++ with headers and templates | 📋 |
| Feature | Description | Status |
|---|---|---|
| Direct Semantic Search | retrieve semantic command | 📋 |
| Batch Operations | Process multiple symbols in one call | 📋 |
| Output Format Control | Compact/full/json output modes | 📋 |
| Query Language | Advanced search with complex filters | 📋 |
| Configuration Profiles | Environment-specific settings | 📋 |
| Machine-Readable Progress | JSON progress output | 📋 |
| Cross-Language References | Track references across languages | 📋 |
| Language Server Protocol | LSP integration for IDEs | 📋 |
Legend: ✅ Complete | 🚧 In Progress | 📋 Planned
- Rust - Full support with trait implementations and generics
- Python - Functions, classes, and imports
- PHP - Classes, functions, and namespaces
Based on developer demand and tree-sitter support:
- JavaScript/TypeScript (v0.4.1) - Most requested for web development
- Go (v0.4.2) - Growing popularity in cloud/backend
- C# (v0.4.3) - Enterprise and game development
- Java (v0.4.4) - Enterprise applications
- C/C++ (v0.4.5) - Systems programming
All retrieve commands and MCP tools support --json flag for structured output with consistent format and proper exit codes (v0.3.0).
Semantic exit codes for scripting: 0 (success), 1 (general error), 3 (not found). Enables reliable automation (v0.3.0).
Simplified syntax with positional arguments for simple tools and key:value pairs for complex tools. No JSON escaping needed (v0.3.0).
Watch mode with automatic re-indexing of changed files. Broadcast channels coordinate updates with 500ms debouncing (v0.3.0).
Modular parser system where languages self-register via a registry. Enables easy addition of new languages without core code changes (v0.4.0).
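A registry of this kind can be sketched with a trait and a lookup table keyed by file extension. The code below is an illustration under assumptions, not Codanna's registry: `LanguageParser`, `Registry`, `register`, and `parser_for` are hypothetical names, and registration is done by explicit calls rather than by any automatic self-registration mechanism.

```rust
use std::collections::HashMap;
use std::path::Path;

/// What the core needs to know about a language (hypothetical trait).
trait LanguageParser {
    fn language(&self) -> &'static str;
    fn extensions(&self) -> &'static [&'static str];
}

struct RustParser;
impl LanguageParser for RustParser {
    fn language(&self) -> &'static str { "rust" }
    fn extensions(&self) -> &'static [&'static str] { &["rs"] }
}

struct PythonParser;
impl LanguageParser for PythonParser {
    fn language(&self) -> &'static str { "python" }
    fn extensions(&self) -> &'static [&'static str] { &["py", "pyi"] }
}

/// Owns the parsers and maps each file extension to one of them.
#[derive(Default)]
struct Registry {
    parsers: Vec<Box<dyn LanguageParser>>,
    by_extension: HashMap<&'static str, usize>,
}

impl Registry {
    /// Adding a language touches only this call site, not the lookup code.
    fn register(&mut self, parser: Box<dyn LanguageParser>) {
        let index = self.parsers.len();
        for ext in parser.extensions() {
            self.by_extension.insert(*ext, index);
        }
        self.parsers.push(parser);
    }

    /// Choose a parser from the file extension, if one is registered.
    fn parser_for(&self, file_name: &str) -> Option<&dyn LanguageParser> {
        let ext = Path::new(file_name).extension()?.to_str()?;
        let index = *self.by_extension.get(ext)?;
        Some(&*self.parsers[index])
    }
}

fn main() {
    let mut registry = Registry::default();
    registry.register(Box::new(RustParser));
    registry.register(Box::new(PythonParser));

    for file in &["src/main.rs", "tools/build.py", "index.js"] {
        match registry.parser_for(file) {
            Some(parser) => println!("{} -> {}", file, parser.language()),
            None => println!("{} -> no parser registered", file),
        }
    }
}
```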
Full PHP parser with classes, functions, namespaces, and traits. Supports PHP 5 through PHP 8 syntax (v0.4.0).
Direct retrieve semantic command for natural language code search without going through MCP interface.
Process multiple symbols in a single command to reduce overhead and improve CI/CD performance.
Choose between compact (script-friendly), full (human-readable), and json output formats.
Full JavaScript/ES6+ parser with modules, classes, async/await, and JSX support.
TypeScript parser with full type annotation support, interfaces, and decorators.
Go language parser with interfaces, goroutines, channels, and struct methods.
C# parser with .NET ecosystem support, LINQ, async/await, and attributes.
Java parser with class hierarchies, interfaces, generics, and annotations.
C and C++ parsers with headers, templates, macros, and cross-compilation units.
Advanced search syntax with wildcards, boolean operators, and complex filters.
Environment-specific settings (dev, test, production) with profile inheritance.
JSON-formatted progress output for better CI/CD integration and monitoring.
Track and analyze references across different programming languages in polyglot codebases.
LSP implementation for IDE integration with real-time code intelligence.
This is an early release focused on core functionality. Contributions welcome! See CONTRIBUTING for guidelines.
Licensed under the Apache License, Version 2.0 - See LICENSE file for details.
Attribution required when using Codanna in your project. See NOTICE file.
Built with 🦀 by developers who wanted their AI assistants to actually understand their code.