Zero compaction. Infinite memory. For Claude Code and every LLM agent.
Your AI never forgets — no more "context compacted" interruptions.
One-line install via Claude Code:
/install-plugin github:soolaugust/0CompactMem
If you use Claude Code, you know this pain:
⚠️ Auto-compact: conversation is approaching context limit...
Every time this happens, your AI loses track of decisions, constraints, and hard-won context. You re-explain. It re-learns. Hours of accumulated understanding — gone in one compaction event.
And if you run multiple agents? They can't share what they've learned. Each one starts from zero.
This isn't a model limitation. It's a missing infrastructure layer.
0CompactMem gives your AI agents persistent, retrievable memory that lives outside the context window. When compaction happens, nothing is lost — because the important stuff was never only in the context window to begin with.
The result: zero effective compaction. Your AI retains every decision, constraint, and lesson across sessions, across compactions, across agents.
You speak
→ 0CompactMem retrieves relevant memories → injects into context
→ AI responds with full context
→ Session ends → decisions and insights auto-extracted → persisted
→ Compaction happens? No problem — memories survive outside the window
→ Next session starts → working set restored automatically
The whole pipeline runs inside Claude Code hooks. There is no manual memory management.
| What others see | What actually happens |
|---|---|
| "Context compacted" | Critical knowledge already persisted to memory store |
| New session starts | Working set auto-restored in <100ms |
| Multiple agents running | All share the same memory — no re-explanation |
| Constraint decided 3 weeks ago | Pinned in memory, guaranteed never evicted |
Zero compaction impact. Zero context loss. Zero re-explanation.
The secret sauce? We didn't invent new algorithms. We borrowed what the Linux kernel has been doing for 40 years:
| OS concept | 0CompactMem equivalent |
|---|---|
| RAM (working space) | Context window — what the AI sees right now |
| Disk (persistent storage) | Knowledge base — facts that survive across sessions |
| Demand paging | On-demand retrieval — fetch relevant memories at the right moment |
mlock |
Hard / soft pinning — guarantee a constraint is never evicted |
| kswapd watermarks | Capacity-aware eviction under pressure |
| CRIU checkpoint / restore | Session snapshots — pause and resume seamlessly |
| Process scheduling | Multi-agent coordination — many agents, one knowledge base |
| kworker thread pool | Async extraction — I/O off the critical path |
| 0CompactMem | mem0 | Letta (MemGPT) | Zep | |
|---|---|---|---|---|
| Design metaphor | OS memory subsystem | Vector store | Agent runtime | Temporal graph |
| Zero-compact guarantee | ✅ pinned memories survive | ❌ | ❌ | ❌ |
| Multi-agent shared | ✅ native, single store | ✅ | ✅ | |
| MCP-native | ✅ first-class | ❌ | ❌ | ❌ |
| Single-file deploy | ✅ SQLite, no service | ❌ needs server | ❌ needs server | ❌ needs server |
| Demand-paging retrieval | ✅ explicit | implicit | implicit | implicit |
| Eviction policy | ✅ kswapd + DAMON | TTL only | recency | recency + decay |
| Pin / mlock semantics | ✅ | ❌ | ❌ | ❌ |
TL;DR. If you're tired of context compaction wiping your AI's memory, and you want a solution that's
pip install, runs as a sidecar on a laptop, shares between several Claude Code / Cursor / custom agents, and never loses a pinned constraint — 0CompactMem is built for that.
| Metric | Value |
|---|---|
| Retrieval latency (P50, hot path) | ~0.1 ms (540x faster than the 54 ms subprocess baseline) |
| Recall@3 vs baseline | +147% |
| Cross-session recall | 94.2% |
| Token cost per call | ~44 tokens injected, +256 tokens net ROI (avoided re-explanation) |
| Test suite | 3,500+ tests across retrieval, eviction, MCP, privacy filter |
One-line install (recommended).
/install-plugin github:soolaugust/0CompactMem
Manual install.
git clone https://github.com/soolaugust/0CompactMem
cd 0CompactMem
pip install -e .
mkdir -p ~/.claude/memory-osDetailed Claude Code hook configuration, daemon management, and troubleshooting live in docs/SETUP.md.
Three layers:
- Hooks — sit at the Claude Code syscall boundary (
SessionStart,UserPromptSubmit,Stop,PostToolUse) and call into the store. - Store — single SQLite file (WAL mode) with FTS5 full-text index, behind a unified VFS interface (
store.py/store_vfs.py/store_criu.py). - Daemons & IPC — persistent retriever daemon (Unix socket), async extractor pool (kworker-style), cross-agent notify bus.
For the full layered diagram, on-disk schema, and the rationale behind each subsystem, see docs/ARCHITECTURE.md. For the comprehensive OS-and-cognitive-science primitive mapping, see docs/DESIGN_PHILOSOPHY.md.
- Distributed 0CompactMem — cgroup-style multi-agent quotas, network-replicated stores
- Adaptive watermarks — eviction tuning that follows observed agent behavior
- arXiv preprint — formal evaluation against mem0 / Letta / Zep
- Per-chunk embedding routing — different models for code vs prose
What landed already (1,051+ tuning iterations, eight major capability rounds) is summarized in CHANGELOG.md. Pain points it has resolved along the way are in docs/PROBLEMS_SOLVED.md.
# stable test subset
python3 -m pytest tests/test_agent_team.py tests/test_chaos.py -qCoverage: per-session DB isolation, concurrent-write safety, cross-agent IPC delivery, extractor-pool queue semantics, CRIU checkpoint validation, goals-progress idempotency.
No GPU. No external API. Everything runs locally.
| Dependency | Purpose |
|---|---|
| Python 3.12+ | Core runtime |
| SQLite (built-in) | Store + FTS5 full-text index |
nc, flock |
Daemon socket + single-instance startup |
Each subsystem hides behind a clean VFS interface, so components are testable in isolation. Issues, design proposals, and pull requests are welcome — see the Discussions tab for design questions, and please run the test subset above before submitting a PR.