MechRosey/SembleSharp

 
 

Fast and Accurate Code Search for Agents
Uses ~98% fewer tokens than grep+read

This is the .NET port of MinishLab/semble. All source in dotnet/ is pure C# targeting net8.0. The Python package's benchmarks still apply because the algorithms are identical.

Semble is a code search library built for agents. It returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read and cutting latency on every step. Indexing and searching a full codebase end-to-end takes under a second, with ~200x faster indexing and ~10x faster queries than a code-specialized transformer, at 99% of its retrieval quality (see benchmarks). Everything runs on CPU with no API keys, GPU, or external services. Run it as an MCP server and any agent (Claude Code, Cursor, Codex, OpenCode, etc.) gets instant access to any repo, cloned and indexed on demand.

Quickstart

Prerequisites

  1. .NET 8 SDK
  2. The default embedding model (downloaded once):
dotnet run --project dotnet/src/Semble.Cli -- download-model

Search a local repo

dotnet run --project dotnet/src/Semble.Cli -- search "authentication flow" ./my-project
dotnet run --project dotnet/src/Semble.Cli -- search "save_pretrained" ./my-project

Publish as a standalone binary

dotnet publish dotnet/src/Semble.Cli -c Release -r linux-x64 --self-contained

Main Features

  • Fast: indexes a repo in ~250 ms and answers queries in ~1.5 ms, all on CPU.
  • Accurate: NDCG@10 of 0.854 on our benchmarks, on par with code-specialized transformer models, at a fraction of the size and cost.
  • Token-efficient: returns only the relevant chunks, using ~98% fewer tokens than grep+read.
  • Zero setup: runs on CPU with no API keys, GPU, or external services required.
  • MCP server: drop-in tool for Claude Code, Cursor, Codex, OpenCode, and any other MCP-compatible agent.
  • Local and remote: pass a local path or a git URL.

MCP Server

Semble runs as an MCP server so agents can search any codebase. Repos are cloned and indexed on demand; indexes are cached for the lifetime of the session.

Setup

Claude Code

claude mcp add semble -- dotnet run --project /path/to/dotnet/src/Semble.Mcp

Or with a published binary:

claude mcp add semble -- /path/to/semble-mcp

Any MCP client (JSON config)

{
  "mcpServers": {
    "semble": {
      "command": "dotnet",
      "args": ["run", "--project", "/path/to/dotnet/src/Semble.Mcp"]
    }
  }
}

Tools

Tool          Description
search        Search a codebase with a natural-language or code query. Pass repo as a git URL or local path.
find_related  Given a file path and line number, return chunks semantically similar to the code at that location.

Sub-agent support

Claude Code sub-agents cannot call MCP tools directly. Use the CLI via Bash instead:

semble search "authentication flow" ./my-project

Or run semble init once in the project root to install a .claude/agents/semble-search.md agent definition.

CLI

# Search a local repo
semble search "authentication flow" ./my-project

# Search for a symbol or identifier
semble search "save_pretrained" ./my-project

# Search a remote repo (cloned on demand)
semble search "save model to disk" https://github.com/MinishLab/model2vec

# Find code similar to a known location
semble find-related src/auth.cs 42 ./my-project

# Show indexed file count, chunks, and total token budget
semble savings ./my-project

# Initialise sub-agent helper (writes .claude/agents/semble-search.md)
semble init

How it works

Semble splits each file into code-aware chunks (Roslyn for C#, tree-sitter for C++, heading-aware for Markdown, page-aware for PDF/DOCX), then scores the chunks against each query with two complementary retrievers: static Model2Vec embeddings from the code-specialized potion-code-16M model for semantic similarity, and BM25 for lexical matches on identifiers and API names. The two result lists are fused with Reciprocal Rank Fusion (RRF).
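As a rough illustration of the fusion step, here is a minimal Python sketch of Reciprocal Rank Fusion. It is not the port's code: the function name and the chunk ids are hypothetical, and k=60 is taken from the constant quoted in the Benchmarks section. Each retriever contributes 1 / (k + rank) for every chunk it returns, so chunks ranked well by both retrievers rise to the top.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of chunk ids; each list is ordered best-first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # A chunk earns more the higher it sits in each list.
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by id for a stable order.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical outputs of the semantic and lexical retrievers.
semantic = ["auth.cs:10", "login.cs:42", "token.cs:7"]
lexical = ["login.cs:42", "session.cs:3", "auth.cs:10"]
fused = reciprocal_rank_fusion([semantic, lexical])
```

In this toy input, login.cs:42 comes first because it is near the top of both lists, even though neither retriever agrees with the other on the first place.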

After fusing, results are reranked with a set of code-aware signals:

Ranking signals
  • Adaptive weighting. Symbol-like queries (Foo::bar, _private, getUserById) get more lexical weight, while natural-language queries stay balanced between semantic and lexical retrievers.
  • Definition boosts. A chunk that defines the queried symbol (a class, def, func, etc.) is ranked above chunks that merely reference it.
  • Identifier stems. Query tokens are stemmed and matched against identifier stems in a chunk, giving an additional weight to chunks that contain them. For example, querying parse config boosts chunks containing parseConfig, ConfigParser, or config_parser.
  • File coherence. When multiple chunks from the same file match the query, the file is boosted so the top result reflects broad file-level relevance rather than a single out-of-context chunk.
  • Noise penalties. Test files, compat/ and legacy/ shims, example code, and .d.ts declaration stubs are down-ranked so canonical implementations surface first.
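The identifier-stem signal in the list above can be sketched roughly as follows. This is an assumption-laden Python illustration, not the port's implementation: split_identifier, crude_stem, and stem_overlap are hypothetical names, and the suffix-stripping stemmer is a deliberately crude stand-in for whatever stemmer Semble actually uses.

```python
import re


def split_identifier(identifier):
    """Split camelCase, PascalCase and snake_case into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def crude_stem(word):
    """Strip one common suffix, keeping at least three characters (toy stemmer)."""
    for suffix in ("ing", "ers", "er", "es", "s", "e"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_overlap(query, identifiers):
    """Count query stems that also occur among the stems of a chunk's identifiers."""
    query_stems = {crude_stem(w) for w in query.lower().split()}
    chunk_stems = {
        crude_stem(part) for ident in identifiers for part in split_identifier(ident)
    }
    return len(query_stems & chunk_stems)
```

With these helpers, the query parse config overlaps on both stems with parseConfig, ConfigParser, and config_parser, and on none with an unrelated identifier such as loadUser.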

Because the embedding model is static with no transformer forward pass at query time, all of this runs in milliseconds on CPU.
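To make the cost argument concrete, here is a toy Python sketch of a static embedding lookup, assuming the common recipe of averaging per-token vectors and comparing with cosine similarity. The vocabulary and 3-dimensional vectors are invented for illustration; the real model ships a far larger table and its own tokenizer.

```python
import math

# Toy token vectors (hypothetical values); a real static model has a large table.
TOKEN_VECTORS = {
    "save": [0.9, 0.1, 0.0],
    "model": [0.7, 0.3, 0.1],
    "disk": [0.8, 0.0, 0.2],
    "parse": [0.0, 0.9, 0.1],
    "config": [0.1, 0.8, 0.3],
}


def embed(text):
    """Average the vectors of known tokens; no neural network is evaluated."""
    vectors = [TOKEN_VECTORS[t] for t in text.lower().split() if t in TOKEN_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


query = embed("save model")
near = cosine(query, embed("save disk"))
far = cosine(query, embed("parse config"))
```

Embedding a query is a handful of table lookups and an average, which is why the semantic retriever stays cheap on CPU.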

Benchmarks

Benchmarks were produced with the Python reference implementation. The .NET port uses identical algorithms (BM25 lucene method, same RRF k=60, same ranking constants).
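For readers unfamiliar with the term, the sketch below shows one common reading of a Lucene-style BM25 variant, in Python. It assumes the formulation idf = ln(1 + (N - df + 0.5) / (df + 0.5)) with a term-frequency factor of tf / (tf + k1 * (1 - b + b * len / avg_len)); the function name, the k1 and b defaults, and the toy documents are illustrative assumptions, not values confirmed by this README.

```python
import math
from collections import Counter


def bm25_lucene_scores(query_tokens, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query (docs must be non-empty)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        length_norm = k1 * (1 - b + b * len(doc) / avg_len)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            score += idf * tf[term] / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical tokenized chunks.
docs = [
    ["save", "pretrained", "model", "path"],
    ["load", "model", "from", "path"],
    ["parse", "config", "file"],
]
scores = bm25_lucene_scores(["save", "pretrained"], docs)
```

Only the first chunk contains the query terms, so it is the only one with a non-zero score; this exact-match behavior is what makes BM25 useful for identifiers such as save_pretrained.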

We benchmark quality and speed across all methods on ~1,250 queries over 63 repositories in 19 languages. The x-axis is total latency (index + first query); the y-axis is NDCG@10. Marker size reflects model parameter count.

[Figure: Speed vs quality]

Method                 NDCG@10   Index time   Query p50
CodeRankEmbed Hybrid   0.862     57 s         16 ms
semble                 0.854     263 ms       1.5 ms
CodeRankEmbed          0.765     57 s         16 ms
ColGREP                0.693     5.8 s        124 ms
BM25                   0.673     263 ms       0.02 ms
grepai                 0.561     35 s         48 ms
probe                  0.387     --           207 ms
ripgrep                0.126     --           12 ms

Semble achieves 99% of the retrieval quality of the 137M-parameter CodeRankEmbed Hybrid (NDCG@10 of 0.854 vs. 0.862), while indexing 218x faster and answering queries 11x faster. See benchmarks for per-language results, ablations, and methodology.
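NDCG@10, the quality metric reported above, rewards rankings that place relevant results near the top. The Python sketch below uses the standard textbook definition with a log2 discount; it is a simplification that computes the ideal ordering from the retrieved list only, and the exact variant used in the benchmark suite is not specified in this README.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k results (rank starts at 1)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k=10):
    """DCG divided by the DCG of the best possible ordering of the same results."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal else 0.0


# Relevance of each returned chunk, in ranked order (1 = relevant, 0 = not).
perfect = ndcg_at_k([1, 0, 0])
second_place = ndcg_at_k([0, 1])
```

A relevant result in first place scores 1.0, while the same result in second place is discounted by 1 / log2(3).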

Token efficiency

Agents using grep+read spend most of their context budget on irrelevant code. Semble returns only the chunks that match, keeping token usage low even at high recall.

[Figure: Token efficiency: recall vs. retrieved tokens]

Semble uses 98% fewer tokens on average, and reaches 94% recall at a budget of only 2k tokens, while grep+read needs a full 100k context window to reach 85%. See benchmarks for details.

License

MIT

Citing

If you use Semble in your research, please cite the following:

@software{minishlab2026semble,
  author       = {{van Dongen}, Thomas and Stephan Tulkens},
  title        = {Semble: Fast and Accurate Code Search for Agents},
  year         = {2026},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.19785932},
  url          = {https://github.com/MinishLab/semble},
  license      = {MIT}
}
