MCP Server  ·  Local  ·  Zero Cloud

Index once.
Save 94%.

A local MCP server for AI coding agents. AST-aware indexing, semantic search, and automatic compression. Your agent stops re-reading your entire codebase every session.

$ uv tool install "code-context-engine[local]"
or: pipx install "code-context-engine[local]" · Ollama users: skip [local]
~/my-project
$ cce init

Code Context Engine · my-project
────────────────────────────────

Git hooks installed (3 hooks, auto-updates)
MCP server registered in .mcp.json
CLAUDE.md created
.gitignore updated

Indexing project...
██████████████████████████████ 89/89 files 100%

Indexed 1,247 chunks from 89 files

Done! Restart your AI coding agent to activate CCE.

Works with your editor
Claude Code · Cursor · VS Code · Gemini CLI · Codex CLI · OpenCode · Tabnine

See it in action

Install, index, search, and see savings in under 30 seconds.

CCE Demo: install, index, search, and see token savings

Reproducible benchmark.
Run it yourself.

20 real coding questions against FastAPI (53 source files, 180K tokens). No synthetic queries, no cherry-picking.

Retrieval · 94% · full files → relevant chunks
Compression · 89% · chunks → signatures
Recall@10 · 0.90 · found the right files
Latency p50 · 0.4ms · per search query

Token flow per query (avg)
Full files · 83,681
After retrieval · 4,927
After compression · 523
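
As a quick sanity check, the retrieval and compression percentages can be recomputed from the average token counts above. This is a minimal sketch: the helper name `saving` is illustrative and not part of CCE, and the end-to-end figure is derived here from the averages rather than reported by the benchmark.

```python
# Average tokens per query, taken from the token-flow figures above.
FULL_FILES = 83_681
AFTER_RETRIEVAL = 4_927
AFTER_COMPRESSION = 523


def saving(before: int, after: int) -> float:
    """Fraction of tokens removed going from `before` to `after`."""
    return 1 - after / before


retrieval = saving(FULL_FILES, AFTER_RETRIEVAL)          # full files -> chunks
compression = saving(AFTER_RETRIEVAL, AFTER_COMPRESSION)  # chunks -> signatures
end_to_end = saving(FULL_FILES, AFTER_COMPRESSION)        # derived, not reported

print(f"retrieval:   {retrieval:.0%}")    # retrieval:   94%
print(f"compression: {compression:.0%}")  # compression: 89%
print(f"end to end:  {end_to_end:.1%}")   # end to end:  99.4%
```

Each percentage uses the previous stage as its baseline, which is why the two layer figures cannot simply be added together.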

Per-Layer Savings

Each layer measured against its own baseline. Not stacked.

Retrieval · measured 94%
Chunk Compression · measured 89%
Output Compression · estimated 65%
Grammar · measured 13%

Reproduce it

$ pip install "code-context-engine[local]"
$ python benchmarks/run_benchmark.py \
    --repo https://github.com/fastapi/fastapi.git \
    --source-dir fastapi

Full results: benchmarks/results/fastapi.md
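
For readers unfamiliar with the metric, here is one common way a Recall@10 figure is computed. This is a sketch under assumptions: the benchmark script may define recall differently, and the file names and expected sets below are made up for illustration.

```python
def recall_at_k(ranked_files: list[str], expected_files: set[str], k: int = 10) -> float:
    """Share of expected files that appear among the top-k ranked results."""
    if not expected_files:
        return 0.0
    top_k = set(ranked_files[:k])
    return len(top_k & expected_files) / len(expected_files)


# Hypothetical results for two queries: (ranked search hits, files a human marked relevant).
queries = [
    (["routing.py", "applications.py", "params.py"], {"routing.py"}),
    (["security/oauth2.py", "utils.py"], {"security/oauth2.py", "security/http.py"}),
]

scores = [recall_at_k(ranked, expected) for ranked, expected in queries]
mean_recall = sum(scores) / len(scores)
print(mean_recall)  # 0.75
```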

Three commands.
Permanent savings.

01
🗂
Index your codebase

Tree-sitter parses your code into semantic chunks (functions, classes, modules). Stored locally with vector embeddings. Git hooks keep the index current automatically.

cce init
02
🔍
Agent searches, not reads

Instead of reading whole files, your agent calls context_search via MCP. Hybrid vector + keyword search finds the relevant chunks. The agent gets the 800 tokens it needs, not an 8,000-token dump.

context_search "payment"
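
Under the hood, an MCP tool invocation is a JSON-RPC 2.0 `tools/call` request. The sketch below builds one for `context_search`. The argument name `query` is an assumption, not confirmed by this page, so check the tool's published input schema before relying on it.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "query" is an assumed argument name used only for illustration.
payload = build_tool_call(1, "context_search", {"query": "payment"})
print(payload)
```

In practice the agent's MCP client builds and sends this message for you; the sketch only shows what travels over the wire.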
03
📊
Track real savings

Every query is recorded. cce savings shows exactly how many tokens CCE saved you, with dollar estimates from live Anthropic pricing.

cce savings
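
The dollar estimate boils down to tokens saved multiplied by a per-token price. A minimal sketch, assuming a flat input-token price: the function name and the 3.00 USD per million tokens figure are placeholders for illustration, not CCE's code or a quoted price.

```python
def estimate_savings_usd(tokens_without: int, tokens_with: int,
                         usd_per_million_input: float) -> float:
    """Estimate money saved on input tokens for a single query."""
    saved_tokens = max(tokens_without - tokens_with, 0)
    return saved_tokens / 1_000_000 * usd_per_million_input


# 8,000 vs 800 tokens mirrors the example above; the price is a placeholder.
usd = estimate_savings_usd(8_000, 800, usd_per_million_input=3.00)
print(f"${usd:.4f}")  # $0.0216
```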

Everything your agent needs.
Nothing it doesn't.

🌳
AST-Aware Chunking

Tree-sitter parses Python, JavaScript, TypeScript, PHP, Go, Rust, and Java into semantic chunks. Functions, classes, imports. No raw file dumps, no context waste.
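
To illustrate the idea of AST-aware chunking, the sketch below splits a Python module into one chunk per top-level function or class. It uses Python's standard-library `ast` module as a stand-in for Tree-sitter, so it covers Python only and is not how CCE itself is implemented. The chunk fields are hypothetical.

```python
import ast


def chunk_python_source(source: str) -> list[dict]:
    """Split a module into one chunk per top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            # lineno/end_lineno are 1-based and inclusive.
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append({
                "kind": kind,
                "name": node.name,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                "text": text,
            })
    return chunks


SAMPLE = '''import os

def charge(amount):
    return amount * 100

class Invoice:
    def total(self):
        return 0
'''

for chunk in chunk_python_source(SAMPLE):
    print(chunk["kind"], chunk["name"], chunk["start_line"], chunk["end_line"])
# function charge 3 4
# class Invoice 6 8
```

Chunking on syntax boundaries means a search hit returns a whole function or class rather than an arbitrary window of lines.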

Hybrid Search + Graph Expansion

Vector similarity + BM25 keyword search merged via Reciprocal Rank Fusion. Then CCE walks one hop on the code graph: if auth.py is a hit, the utils.py it imports comes along too.
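
A minimal sketch of the two steps described above: Reciprocal Rank Fusion over two ranked lists, followed by a one-hop expansion along imports. The constant `k = 60` is the value commonly used for RRF and is an assumption here, as are the function names, file names, and the import map.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each item scores sum(1 / (k + rank)) over the lists."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


def expand_one_hop(hits: list[str], imports: dict[str, list[str]]) -> list[str]:
    """Append files directly imported by any hit, keeping the original order."""
    expanded = list(hits)
    for hit in hits:
        for neighbor in imports.get(hit, []):
            if neighbor not in expanded:
                expanded.append(neighbor)
    return expanded


# Hypothetical results from the vector and keyword searches.
vector_hits = ["auth.py", "session.py", "models.py"]
keyword_hits = ["auth.py", "tokens.py", "session.py"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['auth.py', 'session.py', 'tokens.py', 'models.py']

import_graph = {"auth.py": ["utils.py"]}  # auth.py imports utils.py
print(expand_one_hop(fused[:2], import_graph))  # ['auth.py', 'session.py', 'utils.py']
```

RRF only needs ranks, not raw scores, which makes it a convenient way to combine vector distances and BM25 scores that live on different scales.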

🧠
Smart Compression

With Ollama running locally, chunks are summarized by phi3:mini. Without it, smart truncation extracts signatures and docstrings. Four output levels: off, lite, standard, max.
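
The truncation fallback can be pictured as keeping each function's signature and the first line of its docstring while dropping the body. The sketch below does this for Python source with the standard-library `ast` module (Python 3.9+ for `ast.unparse`). It is an illustration of the idea, not CCE's implementation, and the function name is hypothetical.

```python
import ast


def signature_summary(source: str) -> str:
    """Keep each function's signature and docstring summary; drop the body."""
    tree = ast.parse(source)
    parts = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            header = f"{prefix} {node.name}({ast.unparse(node.args)}){returns}:"
            doc = ast.get_docstring(node)
            if doc:
                # Only the first docstring line survives the truncation.
                header += f'\n    """{doc.splitlines()[0]}"""'
            parts.append(header)
    return "\n".join(parts)


SAMPLE = '''
def charge(amount: int, currency: str) -> int:
    """Charge a card.

    Longer explanation that the summary drops.
    """
    cents = amount * 100
    return cents
'''

print(signature_summary(SAMPLE))
```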

🔗
Session Memory

Recall decisions and code areas across sessions via session_recall, record_decision, and record_code_area. No re-explaining your architecture every session.

🛡
Security by Default

Secret files (.env, *.pem) are never indexed. Content is scanned for API keys and credentials. PII is scrubbed from memory writes. Path traversal protection on all inputs.
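
The sketch below shows what three of these checks might look like: skipping secret files by name, flagging key-like strings in content, and rejecting paths that escape the project root. The file patterns come from the text above; the regular expression and all function names are assumptions for illustration and are far simpler than a production scanner.

```python
import fnmatch
import re
from pathlib import Path

SECRET_FILE_PATTERNS = [".env", "*.pem"]  # patterns named in the text above

# Assumed heuristic: a key-like name assigned a long quoted value.
SECRET_VALUE_RE = re.compile(
    r"(?i)\b(api[_-]?key|secret|token)\b\s*[:=]\s*['\"][A-Za-z0-9_\-]{16,}['\"]"
)


def is_secret_file(path: str) -> bool:
    """True if the file name matches a pattern that must never be indexed."""
    name = Path(path).name
    return any(fnmatch.fnmatch(name, pattern) for pattern in SECRET_FILE_PATTERNS)


def contains_secret(text: str) -> bool:
    """True if the text contains something that looks like a credential."""
    return SECRET_VALUE_RE.search(text) is not None


def is_inside_root(root: str, candidate: str) -> bool:
    """True if `candidate`, resolved against `root`, stays inside `root`."""
    root_path = Path(root).resolve()
    target = (root_path / candidate).resolve()
    return target == root_path or root_path in target.parents


print(is_secret_file("config/.env"))                        # True
print(contains_secret('API_KEY = "abcd1234efgh5678ijkl"'))  # True
print(is_inside_root("/srv/project", "../etc/passwd"))      # False
```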

📊
Web Dashboard

Run cce dashboard for donut charts, bar graphs, file health, and live 5-second polling. Or use cce savings for a quick terminal summary.

How CCE stacks up

Feature · No tool · Caveman · CCE (default) · CCE + Ollama
Compress output tokens
Compress input tokens
Codebase indexing
Session memory
LLM summarization
Cost per session (Sonnet, medium project) · $0.45 · $0.26 · $0.14 · $0.09

Deep dives

May 15, 2026 · 4 min read
v0.4.20: Multi-Agent Targeting and Input/Output Savings Split
New --agent flag for cce init, accurate cost tracking with separate input and output token pricing, and upgrade detection fix.
May 5, 2026 · 8 min read
What is Code Context Engine? Complete Guide
What it does, how it works, how to set it up, all 9 tools explained. Everything a developer needs to know.
May 5, 2026 · 4 min read
How to Save Claude Code Tokens
Claude Code is expensive because of input tokens. Here's how to cut them by 94% with one command.
May 5, 2026 · 4 min read
CCE vs Cursor Built-in Indexing
Local vs cloud. One editor vs six. An honest comparison of tradeoffs.
May 5, 2026 · 5 min read
How to Reduce AI Coding Agent Costs
5 practical ways to cut your Claude Code, Cursor, and Copilot bills.
May 6, 2026 · 5 min read
How to Use CCE with Cursor: Complete Setup Guide
Step-by-step guide to setting up Code Context Engine with Cursor. Works alongside Cursor's built-in indexing.
May 4, 2026 · 5 min read
How We Cut Claude Code Token Usage by 94%
Reproducible benchmark on FastAPI. 20 real queries. Per-layer methodology. Limitations disclosed.

Stop paying to
re-read code.

One install. Automatic savings. Everything local.

$ uv tool install "code-context-engine[local]"
1 uv install
2 cce init
3 Restart editor
Saving tokens