Intelligent code retrieval engine — index, search, and compress any codebase with zero dependencies.
Mnemosyne indexes your codebase into a local SQLite store, scores every chunk with a 6-signal hybrid retriever, compresses results with AST awareness, and returns exactly what you need within a token or result budget. It runs entirely locally — no API keys, no cloud, no runtime dependencies beyond Python 3.11+.
pip install mnemosyne-engine
mnemosyne init # create .mnemosyne/ workspace
mnemosyne ingest # index your codebase
mnemosyne query "How does authentication work?" # search

| Metric | Result |
|---|---|
| Query latency | <20ms warm, <500ms cold |
| Token reduction | 73% on 829-file production repo |
| File retrieval accuracy | 100% across all test sets |
| Ingestion speed | 167 files/sec (~0.5s for 87 files) |
| Compression | 40-70% per chunk, AST-aware |
| Memory footprint | 10-30 MB total |
| Storage overhead | ~4.2 bytes per indexed token |
- Hybrid 6-signal search — BM25, TF-IDF, symbol matching, usage frequency, predictive prefetch, and optional dense embeddings fused via Reciprocal Rank Fusion
- Cost-model ranking — results ranked by value-per-token, not just relevance. Like a query optimizer for code retrieval
- AST-aware compression — four-stage pipeline preserves signatures, docstrings, and control flow while collapsing boilerplate (20-60% reduction)
- Self-tuning ARC cache — adapts between recency and frequency patterns automatically, persisted across sessions
- Delta-aware tracking — detects file and chunk-level changes, delivers diffs instead of full content (80-95% savings on incremental queries)
- Content deduplication — SHA-256 addressed storage eliminates duplicate chunks across files
- 7-language structural chunking — Python (AST), JavaScript/TypeScript, Go, C#, Rust, Java, Kotlin, plus Markdown and plain text
- Daemon mode — JSON-RPC over Unix socket keeps indexes warm for sub-20ms queries
- Full audit trail — append-only JSON-lines log of every operation
- Zero runtime dependencies — pure Python 3.11+ stdlib. One `pip install`, no conflicts
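
The first two features above (rank fusion and value-per-token ranking) can be illustrated with a small sketch. This is not Mnemosyne's implementation: the function names, the chunk IDs, the token counts, and the constant `k=60` are all assumptions chosen for illustration. It fuses three hypothetical signal rankings with Reciprocal Rank Fusion, then greedily fills a token budget in order of fused score per token.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one score per chunk.

    Each chunk earns 1 / (k + rank) from every list it appears in, with
    rank starting at 1. k=60 is a commonly used constant; whether
    Mnemosyne uses the same value is an assumption.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return dict(scores)


def pick_within_budget(scores, token_counts, budget):
    """Greedily select chunks by fused score per token until the budget is spent."""
    by_value = sorted(
        scores,
        key=lambda cid: scores[cid] / token_counts[cid],
        reverse=True,
    )
    chosen, used = [], 0
    for cid in by_value:
        cost = token_counts[cid]
        if used + cost <= budget:  # skip chunks that would overflow the budget
            chosen.append(cid)
            used += cost
    return chosen, used


# Hypothetical rankings from three signals (best match first).
bm25 = ["auth.login", "auth.token", "db.session"]
symbols = ["auth.token", "auth.login"]
usage = ["db.session", "auth.login", "utils.log"]

fused = reciprocal_rank_fusion([bm25, symbols, usage])
tokens = {"auth.login": 400, "auth.token": 150, "db.session": 900, "utils.log": 50}
selected, spent = pick_within_budget(fused, tokens, budget=600)
print(selected, spent)
```

Ranking by score per token rather than raw score means a small, moderately relevant chunk can be selected ahead of a large, highly relevant one. That is the trade-off a cost model makes explicit.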
Code search and navigation — Natural language queries return ranked, deduplicated results with function-level precision. Symbol-aware search finds implementations directly, not just string matches.
LLM context optimization — Feed Claude, GPT, Cursor, or any LLM agent the right tokens from a 100K+ codebase. Drop-in integration via instruction files cuts API spend 70%+ on context-heavy workflows.
Developer onboarding — New team members query "how does X work?" and get ranked results spanning models, middleware, and routes — complete function signatures with context, not random line hits.
PR review and CI/CD — Delta tracking identifies which functions changed and pulls their callers and tests into a review bundle. Pipe query output into automated review pipelines.
Legacy codebase archaeology — Before a rewrite or migration, index a large monolith to answer "what calls this table?" or "which modules depend on this API?" Hybrid search beats grep for cross-cutting queries.
Security audit surface mapping — Query for patterns like exec(, eval(, subprocess.call with usage-frequency ranking to prioritize the most-called dangerous patterns. Audit log provides evidence trail for compliance.
Incident response — An on-call engineer searches "payment timeout retry" at 3am and gets ranked, compressed results from across the codebase instead of grepping blindly.
Migration impact analysis — Planning a framework upgrade or library swap? Query every usage of the old API, ranked by call frequency, to estimate effort and prioritize high-traffic paths.
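
Several of these use cases, PR review in particular, depend on knowing which chunks changed between two index runs. The following is a minimal sketch of content-addressed change detection with SHA-256, assuming each chunk is keyed by a hypothetical name such as `billing.charge`. The function names and snapshot layout are illustrative and do not describe Mnemosyne's actual storage schema.

```python
import hashlib


def chunk_digest(text: str) -> str:
    """Content address for a chunk: SHA-256 of its UTF-8 bytes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_chunks(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {chunk_name: digest} snapshots and classify each chunk."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(
            name for name in old.keys() & new.keys() if old[name] != new[name]
        ),
    }


# Hypothetical chunks before and after a pull request.
before = {
    "billing.charge": "def charge(amount):\n    return gateway.pay(amount)\n",
    "billing.refund": "def refund(tx):\n    return gateway.reverse(tx)\n",
}
after = {
    "billing.charge": "def charge(amount, retries=3):\n    return gateway.pay(amount)\n",
    "billing.refund": "def refund(tx):\n    return gateway.reverse(tx)\n",
    "billing.audit": "def audit(tx):\n    log.write(tx)\n",
}

old_index = {name: chunk_digest(src) for name, src in before.items()}
new_index = {name: chunk_digest(src) for name, src in after.items()}
delta = diff_chunks(old_index, new_index)
print(delta)
```

Because identical content always produces the same digest, the same hashes can also serve as deduplication keys: two chunks with equal digests need to be stored only once.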
Add to your CLAUDE.md, .cursorrules, or equivalent instruction file:
Before answering questions about this codebase, run:
! mnemosyne query "<question>" --budget 8000
Use the returned chunks as primary context. Only read additional files if needed.
Works with Claude Code, Cursor, Aider, Copilot, and any agent that can run shell commands.
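
For agents driven from Python rather than an instruction file, the same command can be called through `subprocess`. This is a sketch under stated assumptions: the helper names `build_query_command` and `fetch_context` are hypothetical, `mnemosyne` is assumed to be on `PATH`, and the output is returned as raw text because its format is not specified here. Only the `query` subcommand and the `--budget` flag come from the instructions above.

```python
import shlex
import subprocess


def build_query_command(question: str, budget: int = 8000) -> list[str]:
    """Build the argv for `mnemosyne query` with a token budget."""
    if budget <= 0:
        raise ValueError("budget must be a positive number of tokens")
    return ["mnemosyne", "query", question, "--budget", str(budget)]


def fetch_context(question: str, budget: int = 8000) -> str:
    """Run the query and return its standard output unchanged.

    Assumes the workspace has already been initialized and ingested.
    """
    result = subprocess.run(
        build_query_command(question, budget),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


command = build_query_command("How does authentication work?")
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems when the question contains quotes or other special characters.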
| Command | Purpose |
|---|---|
| `init` | Create workspace and config |
| `ingest` | Index files (incremental; `--full` to rebuild) |
| `query` | Search with token budget |
| `stats` | Index and cache statistics |
| `compress` | Preview compression for a file |
| `delta` | Show changes since last index |
| `cache` | Manage ARC cache (show, clear, warm) |
| `daemon` | Persistent server for warm-start queries |
| `analytics` | Precision metrics and usage patterns |
| `audit` | Operation log |
| `health` | Index integrity checks |
| `gc` | Garbage collect stale data |
| `benchmark` | Run precision benchmarks |
| Document | Contents |
|---|---|
| REFERENCE.md | Full CLI reference, configuration, architecture, integration guides |
| ALGORITHMS.md | Algorithm details with academic paper references |
| TUNING.md | Precision tuning guide |
| CHANGELOG.md | Version history |
Dual-licensed: AGPL-3.0 for open-source use, or a commercial license for proprietary embedding.
Copyright 2026 Cast Rock Innovation L.L.C. (DBA: Cast Net Technology)
