Self-evolving memory OS for LLM & AI Agents: ultra-persistent memory, hybrid-retrieval, and cross-task skill reuse, with 35.24% token savings
Updated May 22, 2026 - TypeScript
The leading, most token-efficient MCP server for GitHub source code exploration via tree-sitter AST parsing
Universal AI context generator. Saves thousands of tokens per conversation in Claude Code, Cursor, Copilot, Codex, and more.
Symbol Delta Ledger (SDL-MCP) gives coding agents the right code context, not your entire repo. It turns sprawling codebases into compact, high-signal context that saves tokens, speeds up workflows, and improves agent output.
Less is more. Make your agents smarter and faster. It’s not just about saving time; it’s about the feeling of not wasting it.
Save 94% on AI coding tokens. Index your codebase so agents search instead of reading files. Works with Claude Code, Codex, Copilot, Cursor, Gemini CLI. Local MCP server, free, open source.
MCP server that saves Claude Code tokens by delegating bounded tasks to local or cloud LLMs. Works with LM Studio, Ollama, vLLM, DeepSeek, Groq, Cerebras.
MCP server for Claude Code and Codex. One tool call replaces ~42 minutes of agent exploration
Local-first Model Context Protocol (MCP) memory layer for Codex CLI/Desktop, Claude Code, Gemini CLI, Qwen/DeepSeek/Ollama and agent workflows. SQLite + FTS5 compact context packs, token savings, read-only mode, no external memory server.
Guardian Agent and Token Savings for Claude Code
MCP server for Git with local Ollama — zero tokens for git operations
A reversible code minifier for AI. Save tokens by stripping code formatting in your prompt, then perfectly restore it in the responses.
TSCG — Deterministic tool-schema compiler for LLM agents. 50-72% token savings, 50 tools in 2.4ms. Phi-4 recovers from 0% to 90% accuracy. 459 tests, zero dependencies, MIT.
Turn any OpenAPI spec into a native CLI binary. No MCP, no bloat, no runtime dependencies, ONLY CLI.
Caveman output style for Claude Code: 40% fewer output tokens, always-on formatting
Project-agnostic dual-memory MCP CLI for Claude Code, Cursor, and OpenCode (Qdrant tuned hybrid retrieval + structural memory hooks)
Make GitHub Copilot responses terse across VS Code Chat, Copilot CLI, cloud agent, and code review. One command. 40-75% fewer response tokens, no correctness hit.
Persistent code graph for AI agents. Tree-sitter → SQLite → MCP. Agents store understanding once, skip re-reads forever. 51–1500x fewer tokens.
Save Claude tokens by offloading tasks to free LLM APIs - 19x more workflows with the same token budget
Unified encode/decode payload manager with AI semantic compression. Four modes: xz, gz, native AI (awk, no API, ~25-40%), LLM API (Claude/OpenAI/Ollama, ~5-10%).