Semantic code search and relationship tracking via MCP and Unix CLI.
- How It Works
- Installation
- Quick Start
- Claude Integration
- Configuration
- Documentation Comments for Better Search
- CLI Commands
- MCP Tools
- Performance
- Architecture Highlights
- Requirements
- Current Limitations
- Roadmap
- Feature Details
- Contributing
- License
- Parse - Tree-sitter AST parsing for Rust, Python, and PHP (more languages coming)
- Extract - Symbols, call graphs, implementations, and type relationships
- Embed - 384-dimensional vectors from doc comments via AllMiniLML6V2
- Index - Tantivy for full-text search + memory-mapped symbol cache for <10ms lookups
- Serve - MCP protocol for AI assistants, ~300ms response time
# Install latest version
cargo install codanna
# Install with HTTP server (OAuth authentication)
cargo install codanna --features http-server
# Install with HTTPS server (TLS + optional OAuth)
cargo install codanna --features https-server
# Install from local path (development)
cargo install --path . --all-features- Initialize:
# Initialize codanna index space and create .codanna/settings.toml
codanna init- Index your codebase:
# Index with progress display
codanna index src --progress
# See what would be indexed (dry run)
codanna index . --dry-run
# Index a specific file
codanna index src/main.rs- Search your code:
# Semantic search with new simplified syntax
codanna mcp semantic_search_docs query:"parse rust files" limit:3 --json
# Find symbols with JSON output
codanna retrieve symbol Parser --json
# Analyze function relationships
codanna mcp find_callers process_file --json | jq '.data[].name'
# Legacy format still works
codanna mcp semantic_search_with_context --args '{"query": "parse rust files and extract symbols", "limit": 3}'Add to your .mcp.json:
{
"mcpServers": {
"codanna": {
"command": "codanna",
"args": ["serve", "--watch", "--watch-interval", "5"]
}
}
}For persistent server with real-time file watching:
# HTTP server with OAuth authentication (requires http-server feature)
codanna serve --http --watch
# HTTPS server with TLS encryption (requires https-server feature)
codanna serve --https --watchConfigure in .mcp.json:
{
"mcpServers": {
"codanna-sse": {
"type": "sse",
"url": "http://127.0.0.1:8080/mcp/sse"
}
}
}For HTTPS configuration, see the HTTPS Server Mode documentation.
We include a codanna-navigator sub agent at .claude/agents/codanna-navigator.md. This agent is optimized for using the codanna MCP server.
Codanna CLI is unix-friendly with positional arguments and JSON output for easy command chaining:
# New simplified syntax - positional arguments for simple tools
codanna mcp find_symbol main --json
codanna mcp get_calls process_file
codanna mcp find_callers init
# Key:value pairs for complex tools
codanna mcp semantic_search_docs query:"error handling" limit:3 --json
codanna mcp search_symbols query:parse kind:function --json
# Unix piping with JSON output
time codanna mcp search_symbols query:parse limit:1 --json | \
jq -r '.data[0].name' | \
xargs -I {} codanna retrieve callers {} --json | \
jq -r '.data[] | "\(.name) in \(.module_path)"'
# Result:
#
# main in crate::main
# serve_http in crate::mcp::http_server::serve_http
# serve_http in crate::mcp::http_server::serve_http
# serve_https in crate::mcp::https_server::serve_https
# serve_https in crate::mcp::https_server::serve_https
# parse in crate::parsing::rust::parse
# parse in crate::parsing::rust::parse
# parse in crate::parsing::python::parse
#
# codanna mcp search_symbols query:parse limit:1 --json 0.10s user 0.08s system 122% cpu 0.143 total
# jq -r '.data[0].name' 0.00s user 0.00s system 3% cpu 0.142 total
# xargs -I {} codanna retrieve callers {} --json 0.11s user 0.07s system 63% cpu 0.288 total
# jq -r '.data[] | "\(.name) in \(.module_path)"' 0.00s user 0.00s system 1% cpu 0.288 total
# Legacy format still supported for backward compatibility
codanna mcp find_symbol --args '{"name": "main"}'All MCP tools support --json flag for structured output, making integration with other tools seamless.
Configure Codanna in .codanna/settings.toml:
[semantic_search]
enabled = true
model = "AllMiniLML6V2"
threshold = 0.6 # Similarity threshold (0-1)
[indexing]
parallel_threads = 16 # Auto-detected by default
include_tests = true # Index test filesCodanna respects .gitignore and adds its own .codannaignore:
# Created automatically by codanna init
.codanna/ # Don't index own data
target/ # Skip build artifacts
node_modules/ # Skip dependencies
*_test.rs # Optionally skip testsSemantic search works by understanding your documentation comments:
/// Parse configuration from a TOML file and validate required fields
/// This handles missing files gracefully and provides helpful error messages
fn load_config(path: &Path) -> Result<Config, Error> {
// implementation...
}With good comments, semantic search can find this function when prompted for:
- "configuration validation"
- "handle missing config files"
- "TOML parsing with error handling"
This encourages better documentation → better AI understanding → more motivation to document.
| Command | Description | Example |
|---|---|---|
codanna init |
Set up .codanna directory with default configuration | codanna init --force |
codanna index <PATH> |
Build searchable index from your codebase | codanna index src --progress |
codanna config |
Display active settings | codanna config |
codanna serve |
Start MCP server for AI assistants | codanna serve --watch |
All retrieve commands support --json flag for structured output (exit code 3 when not found).
| Command | Description | Example |
|---|---|---|
retrieve symbol <NAME> |
Find a symbol by name | codanna retrieve symbol main --json |
retrieve calls <FUNCTION> |
Show what functions a given function calls | codanna retrieve calls parse_file --json |
retrieve callers <FUNCTION> |
Show what functions call a given function | codanna retrieve callers main --json |
retrieve implementations <TRAIT> |
Show what types implement a trait | codanna retrieve implementations Parser --json |
retrieve impact <SYMBOL> |
Show the impact radius of changing a symbol | codanna retrieve impact main --depth 3 --json |
retrieve search <QUERY> |
Search for symbols using full-text search | codanna retrieve search "parse" --limit 5 --json |
retrieve describe <SYMBOL> |
Show comprehensive information about a symbol | codanna retrieve describe SimpleIndexer --json |
| Command | Description | Example |
|---|---|---|
codanna mcp-test |
Verify Claude can connect and list available tools | codanna mcp-test |
codanna mcp <TOOL> |
Execute MCP tools without spawning server | codanna mcp find_symbol main --json |
codanna benchmark |
Benchmark parser performance | codanna benchmark rust --file my_code.rs |
--config,-c: Path to custom settings.toml file--force,-f: Force operation (overwrite, re-index, etc.)--progress,-p: Show progress during operations--threads,-t: Number of threads to use--dry-run: Show what would happen without executing
Available tools when using the MCP server. All tools support --json flag for structured output.
| Tool | Description | Example |
|---|---|---|
find_symbol |
Find a symbol by exact name | codanna mcp find_symbol main --json |
get_calls |
Show functions called by a given function | codanna mcp get_calls process_file |
find_callers |
Show functions that call a given function | codanna mcp find_callers init |
analyze_impact |
Analyze the impact radius of symbol changes | codanna mcp analyze_impact Parser --json |
get_index_info |
Get index statistics and metadata | codanna mcp get_index_info --json |
| Tool | Description | Example |
|---|---|---|
search_symbols |
Search symbols with full-text fuzzy matching | codanna mcp search_symbols query:parse kind:function limit:10 |
semantic_search_docs |
Search using natural language queries | codanna mcp semantic_search_docs query:"error handling" limit:5 |
semantic_search_with_context |
Search with enhanced context | codanna mcp semantic_search_with_context query:"parse files" threshold:0.7 |
| Tool | Parameters |
|---|---|
find_symbol |
name (required) |
search_symbols |
query, limit, kind, module |
semantic_search_docs |
query, limit, threshold |
semantic_search_with_context |
query, limit, threshold |
get_calls |
function_name |
find_callers |
function_name |
analyze_impact |
symbol_name, max_depth |
get_index_info |
None |
Parser benchmarks on a 750-symbol test file:
| Language | Parsing Speed | vs. Target (10k/s) | Status |
|---|---|---|---|
| Rust | 91,318 symbols/sec | 9.1x faster ✓ | Production |
| Python | 75,047 symbols/sec | 7.5x faster ✓ | Production |
| PHP | 68,432 symbols/sec | 6.8x faster ✓ | Production |
| JavaScript | - | - | v0.4.1 |
| TypeScript | - | - | v0.4.1 |
Key achievements:
- Zero-cost abstractions: All parsers use borrowed string slices with no allocations in hot paths
- Parallel processing: Multi-threaded indexing that scales with CPU cores
- Memory efficiency: Approximately 100 bytes per symbol including all metadata
- Real-time capability: Fast enough for incremental parsing during editing
- Optimized CLI startup: ~300ms for all operations (53x improvement from v0.2)
- JSON output: Zero overhead - structured output adds <1ms to response time
Run performance benchmarks:
codanna benchmark all # Test all parsers
codanna benchmark python # Test specific languageMemory-mapped storage: Two caches for different access patterns:
symbol_cache.bin- FNV-1a hashed symbol lookups, <10ms response timesegment_0.vec- 384-dimensional vectors, <1μs access after OS page cache warm-up
Embedding lifecycle management: Old embeddings deleted when files are re-indexed to prevent accumulation.
Lock-free concurrency: DashMap for concurrent symbol reads, write coordination via single writer lock.
Single-pass indexing: Symbols, relationships, and embeddings extracted in one AST traversal.
Hot reload: File watcher with 500ms debounce triggers re-indexing of changed files only.
- Rust 1.75+ (for development)
- ~150MB for model storage (downloaded on first use)
- A few MB for index storage (varies by codebase size)
- Supports Rust, Python, and PHP (JavaScript/TypeScript coming in v0.4.1)
- Semantic search requires English documentation/comments
- Windows support is experimental
- 0.3.x - CLI improvements and API stability
- 0.4.x - Language expansion via modular architecture
- 0.5.x - Enterprise features and advanced analysis
| Feature | Description | Status |
|---|---|---|
| JSON Output Support | Structured output for all commands | ✓ |
| Exit Codes | Semantic exit codes for scripting | ✓ |
| Unix-Friendly CLI | Positional args and key:value syntax | ✓ |
| Incremental Index Updates | File watching with auto re-indexing | ✓ |
| Feature | Description | Status |
|---|---|---|
| Language Registry Architecture | Modular parser system for easy language additions | ✓ |
| PHP Support | Full PHP parser implementation | ✓ |
| Enhanced Symbol Extraction | Comprehensive symbol extraction for all parsers | ✓ |
| Feature | Description | Status |
|---|---|---|
| JavaScript Support | Full JavaScript/ES6+ parser | ○ |
| TypeScript Support | TypeScript with type annotations | ○ |
| Feature | Description | Status |
|---|---|---|
| Go Support | Go language with interfaces and goroutines | ○ |
| Feature | Description | Status |
|---|---|---|
| C# Support | C# with .NET ecosystem support | ○ |
| Feature | Description | Status |
|---|---|---|
| Java Support | Java with class hierarchies | ○ |
| Feature | Description | Status |
|---|---|---|
| C/C++ Support | C and C++ with headers and templates | ○ |
| Feature | Description | Status |
|---|---|---|
| Direct Semantic Search | retrieve semantic command |
○ |
| Batch Operations | Process multiple symbols in one call | ○ |
| Output Format Control | Compact/full/json output modes | ○ |
| Query Language | Advanced search with complex filters | ○ |
| Configuration Profiles | Environment-specific settings | ○ |
| Machine-Readable Progress | JSON progress output | ○ |
| Cross-Language References | Track references across languages | ○ |
| Language Server Protocol | LSP integration for IDEs | ○ |
Legend: ✓ Complete | → In Progress | ○ Planned
- Rust - Full support with traits, generics, enums, type aliases, constants, and statics
- Python - Functions, classes, module-level variables, constants, and lambda functions
- PHP - Classes, functions, namespaces, traits, global constants, and variables
Based on developer demand and tree-sitter support:
- JavaScript/TypeScript (v0.4.1) - Most requested for web development
- Go (v0.4.2) - Growing popularity in cloud/backend
- C# (v0.4.3) - Enterprise and game development
- Java (v0.4.4) - Enterprise applications
- C/C++ (v0.4.5) - Systems programming
All retrieve commands and MCP tools support --json flag for structured output with consistent format and proper exit codes (v0.3.0).
Semantic exit codes for scripting: 0 (success), 1 (general error), 3 (not found). Enables reliable automation (v0.3.0).
Simplified syntax with positional arguments for simple tools and key:value pairs for complex tools. No JSON escaping needed (v0.3.0).
Watch mode with automatic re-indexing of changed files. Broadcast channels coordinate updates with 500ms debouncing (v0.3.0).
Modular parser system where languages self-register via a registry. Enables easy addition of new languages without core code changes (v0.4.0).
Full PHP parser with classes, functions, namespaces, and traits. Supports PHP 5 through PHP 8 syntax (v0.4.0).
Expanded symbol extraction across all parsers. Rust now extracts enums, type aliases, constants, and statics. Python extracts module-level variables, constants by naming convention, and lambda functions. PHP extracts global constants (const and define) and global variables (v0.4.0).
Direct retrieve semantic command for natural language code search without going through MCP interface.
Process multiple symbols in a single command to reduce overhead and improve CI/CD performance.
Choose between compact (script-friendly), full (human-readable), and json output formats.
Full JavaScript/ES6+ parser with modules, classes, async/await, and JSX support.
TypeScript parser with full type annotation support, interfaces, and decorators.
Go language parser with interfaces, goroutines, channels, and struct methods.
C# parser with .NET ecosystem support, LINQ, async/await, and attributes.
Java parser with class hierarchies, interfaces, generics, and annotations.
C and C++ parsers with headers, templates, macros, and cross-compilation units.
Advanced search syntax with wildcards, boolean operators, and complex filters.
Environment-specific settings (dev, test, production) with profile inheritance.
JSON-formatted progress output for better CI/CD integration and monitoring.
Track and analyze references across different programming languages in polyglot codebases.
LSP implementation for IDE integration with real-time code intelligence.
This is an early release focused on core functionality. Contributions welcome! See CONTRIBUTING for guidelines.
Licensed under the Apache License, Version 2.0 - See LICENSE file for details.
Attribution required when using Codanna in your project. See NOTICE file.
Built with 🦀 by developers who wanted their AI assistants to actually understand their code.