MCP Server · Local · Zero Cloud

Index once.
Save 94%.

A local MCP server for AI coding agents. AST-aware indexing, semantic search, and automatic compression. Your agent stops re-reading your entire codebase every session.

$ uv tool install "code-context-engine[local]"

or: pipx install "code-context-engine[local]" · Ollama users: omit the [local] extra

~/my-project

$ cce init

Code Context Engine · my-project
────────────────────────────────
✓ Git hooks installed (3 hooks, auto-updates)
✓ MCP server registered in .mcp.json
✓ CLAUDE.md created
✓ .gitignore updated

Indexing project...
██████████████████████████████ 89/89 files 100%
✓ Indexed 1,247 chunks from 89 files

Done! Restart your AI coding agent to activate CCE.

Benchmark

Reproducible benchmark.
Run it yourself.

20 real coding questions against FastAPI (53 source files, 180K tokens). No synthetic queries, no cherry-picking.

Retrieval

94%

full files → relevant chunks

Compression

89%

chunks → signatures

Recall@10

0.90

found the right files

Latency p50

0.4ms

per search query
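
For readers unfamiliar with the metric, here is a minimal sketch of how Recall@k is commonly computed. The page does not spell out the benchmark's exact definition, so treat this as the textbook form; the file names are hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 10) -> float:
    """Share of the relevant files that appear in the top-k retrieved results."""
    if not relevant:
        raise ValueError("relevant must contain at least one item")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


# Hypothetical query: two files hold the answer, the search surfaced one of them.
retrieved = ["routing.py", "applications.py", "params.py"]
relevant = {"routing.py", "dependencies/utils.py"}
score = recall_at_k(retrieved, relevant, k=10)
print(score)
```

Averaging this score over all benchmark questions gives a single figure like the one reported above.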

Token flow per query (avg)

Full files

83,681

After retrieval

4,927

After compression

523
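
The headline percentages can be checked against these averages with a few lines of arithmetic. This is only a sketch of the calculation; the three token counts are copied from the figures above and the function name is illustrative.

```python
# Average tokens per query, copied from the figures above.
FULL_FILES = 83_681
AFTER_RETRIEVAL = 4_927
AFTER_COMPRESSION = 523


def reduction(before: int, after: int) -> float:
    """Fraction of tokens removed when going from `before` to `after`."""
    return 1 - after / before


# Each layer is compared against its own input, not against the full files.
retrieval_saving = reduction(FULL_FILES, AFTER_RETRIEVAL)
compression_saving = reduction(AFTER_RETRIEVAL, AFTER_COMPRESSION)

print(f"retrieval:   {retrieval_saving:.1%}")
print(f"compression: {compression_saving:.1%}")
```

Rounded to whole percentages, the two values match the 94% and 89% shown in the stat cards.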

Per-Layer Savings

Each layer measured against its own baseline. Not stacked.

Retrieval · measured 94%

Chunk Compression · measured 89%

Output Compression · estimated 65%

Grammar · measured 13%

Reproduce it

$ pip install "code-context-engine[local]"
$ python benchmarks/run_benchmark.py \
--repo https://github.com/fastapi/fastapi.git \
--source-dir fastapi

Full results: benchmarks/results/fastapi.md

How it works

Three commands.
Permanent savings.

🗂

Index your codebase

Tree-sitter parses your code into semantic chunks (functions, classes, modules). Stored locally with vector embeddings. Git hooks keep the index current automatically.

cce init

🔍

Agent searches, not reads

Instead of reading whole files, your agent calls context_search via MCP. Hybrid vector + keyword search finds the relevant chunks, so the agent gets the 800 tokens it needs, not an 8,000-token dump.

context_search "payment"
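
On the wire, an MCP tool invocation is a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a request for context_search. The argument name "query" is an assumption for illustration; the page does not document the tool's input schema.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "query" is an assumed argument name, not CCE's documented schema.
request = build_tool_call(1, "context_search", {"query": "payment"})
print(request)
```

In practice the agent's MCP client constructs this message for you; the sketch only shows what "calls context_search via MCP" amounts to.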

📊

Track real savings

Every query is recorded. cce savings shows exactly how many tokens CCE saved you, with dollar estimates from live Anthropic pricing.

cce savings

Features

Everything your agent needs.
Nothing it doesn't.

🌳

AST-Aware Chunking

Tree-sitter parses Python, JavaScript, TypeScript, PHP, Go, Rust, and Java into semantic chunks. Functions, classes, imports. No raw file dumps, no context waste.
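
To illustrate the idea of AST-aware chunking, here is a minimal sketch that uses Python's standard-library ast module in place of Tree-sitter. It handles Python source only and emits one chunk per top-level function or class; the chunk fields are hypothetical and not CCE's storage format.

```python
import ast

SOURCE = '''\
import os

def charge(amount):
    """Charge the customer."""
    return amount * 1.2

class Invoice:
    def total(self):
        return 0
'''


def chunk_source(source: str) -> list[dict]:
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            chunks.append({
                "kind": kind,
                "name": node.name,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                # Exact source text of this definition, ready to embed or index.
                "text": ast.get_source_segment(source, node),
            })
    return chunks


chunks = chunk_source(SOURCE)
for chunk in chunks:
    print(chunk["kind"], chunk["name"], chunk["start_line"], chunk["end_line"])
```

Because chunk boundaries follow the syntax tree, a search hit returns a whole function or class rather than an arbitrary window of lines.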

⚡

Hybrid Search + Graph Expansion

Vector similarity and BM25 keyword search are merged via Reciprocal Rank Fusion. CCE then walks one hop on the code graph: if auth.py is a hit, the utils.py it imports comes along too.
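
A minimal sketch of the two steps described above: Reciprocal Rank Fusion over two ranked lists, followed by a one-hop walk over an import graph. The constant k = 60 is the value conventionally used with RRF, not a documented CCE setting, and the file names and import map are hypothetical.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each item scores sum(1 / (k + rank)) over the lists."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


def expand_one_hop(hits: list[str], imports: dict[str, list[str]]) -> list[str]:
    """Append files directly imported by a hit, preserving order, no duplicates."""
    result = list(hits)
    for hit in hits:
        for neighbor in imports.get(hit, []):
            if neighbor not in result:
                result.append(neighbor)
    return result


vector_hits = ["auth.py", "session.py", "models.py"]
keyword_hits = ["auth.py", "models.py", "routes.py"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
expanded = expand_one_hop(fused[:2], {"auth.py": ["utils.py"]})
print(fused)
print(expanded)
```

Files that rank well in both lists rise to the top, and utils.py joins the result only because a top hit imports it.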

🧠

Smart Compression

With Ollama running locally, chunks are summarized by phi3:mini. Without it, smart truncation extracts signatures and docstrings. Four output levels: off, lite, standard, max.
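
The non-Ollama fallback can be pictured as reducing each function to its signature and the first line of its docstring. The sketch below does this for Python source with the standard-library ast module; it is an assumption about the general approach, not CCE's actual truncation logic.

```python
import ast

SAMPLE = '''\
def refund(order_id, amount):
    """Refund part of an order.

    Longer explanation that the compressed view drops.
    """
    record = lookup(order_id)
    return record.refund(amount)
'''


def signature_and_docstring(source: str) -> str:
    """Reduce each top-level function to its `def` line plus docstring summary."""
    tree = ast.parse(source)
    lines = source.splitlines()
    parts = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Only the first line of the signature is kept; a signature that
            # spans several lines would be cut short by this simple version.
            parts.append(lines[node.lineno - 1].strip())
            doc = ast.get_docstring(node)
            if doc:
                parts.append(f'    """{doc.splitlines()[0]}"""')
    return "\n".join(parts)


compressed = signature_and_docstring(SAMPLE)
print(compressed)
```

The function body is dropped entirely, which is where most of the token savings in a signature-level view come from.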

🔗

Session Memory

Recall decisions and code areas across sessions via session_recall, record_decision, and record_code_area. No re-explaining your architecture every session.

🛡

Security by Default

Secret files (.env, *.pem) are never indexed. Content is scanned for API keys and credentials. PII is scrubbed from memory writes. Path traversal protection on all inputs.
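
A rough sketch of three of these checks: skipping secret files by name, scanning content for credential-like assignments, and rejecting paths that escape the project root. The file patterns come from the text above; the regular expression and function names are hypothetical, and real scanners use far broader rule sets.

```python
import fnmatch
import posixpath
import re

SECRET_FILE_PATTERNS = [".env", "*.pem"]  # patterns named in the text above

# Hypothetical rule: a key-like name assigned a long token-like value.
KEY_PATTERN = re.compile(
    r"(?i)\b(api[_-]?key|secret|token)\b\s*[:=]\s*['\"]?[A-Za-z0-9_\-]{16,}"
)


def is_secret_file(path: str) -> bool:
    """True if the file name matches a pattern that must never be indexed."""
    name = posixpath.basename(path)
    return any(fnmatch.fnmatchcase(name, p) for p in SECRET_FILE_PATTERNS)


def contains_credential(text: str) -> bool:
    """True if the text contains something shaped like a hard-coded credential."""
    return KEY_PATTERN.search(text) is not None


def is_inside_root(root: str, candidate: str) -> bool:
    """True if `candidate`, joined to `root`, stays inside `root`.

    Purely lexical: symlinks are not resolved in this sketch.
    """
    resolved = posixpath.normpath(posixpath.join(root, candidate))
    return resolved == root or resolved.startswith(root.rstrip("/") + "/")


print(is_secret_file("config/.env"))
print(contains_credential('API_KEY = "sk_live_abcdefghijklmnop1234"'))
print(is_inside_root("/repo", "../etc/passwd"))
```

Running the name check before reading a file means secret files are never opened, let alone indexed.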

📊

Web Dashboard

Run cce dashboard for donut charts, bar graphs, file health, and live 5-second polling. Or use cce savings for a quick terminal summary.

How CCE stacks up

Feature	No tool	Caveman	CCE (default)	CCE + Ollama
Compress output tokens	✗	✓	✓	✓
Compress input tokens	✗	✗	✓	✓
Codebase indexing	✗	✗	✓	✓
Session memory	✗	✗	✓	✓
LLM summarization	✗	✗	✗	✓
Cost per session (Sonnet, medium project)	$0.45	$0.26	$0.14	$0.09
