Feature Description
Introduce semantic search pre-filtering (with reranking) as a complementary step to Code Mode.
This feature reduces large toolsets (1,000+ tools and enumerations) to a smaller, semantically relevant subset before Code Mode hands them to the LLM, further reducing LLM token use while improving accuracy.
Motivation
- LLM tokens are significantly more expensive than embedding tokens.
- Even state-of-the-art models with 1M+ context windows suffer from Context Rot when overloaded with too many tools, especially in multi-step tasks.
Proposed Solution
- Apply semantic-search pre-filtering to narrow thousands of tools down to a few dozen, and likewise narrow their enum parameters.
- Use reranking to refine the subset further, prioritizing the most relevant tools.
- Extend this approach to enumerations within tools, reducing large option sets.
- Pass the reduced set into Code Mode, which then optimizes token usage and reasoning clarity with the LLM.
This creates a scalable, automated RAG pipeline capable of handling millions of tools without manual intervention; a minimal sketch of the retrieve-and-rerank step follows.
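Concretely, the retrieve-and-rerank step could look like the sketch below, built on sentence-transformers. The model choices, the tool-dict shape, and the prefilter_tools helper are illustrative assumptions, not the mcp-use API.

```python
# Minimal sketch of semantic pre-filtering with reranking, assuming
# sentence-transformers; the tool dicts with "name"/"description" fields
# and the function name are hypothetical, not the mcp-use schema.
from sentence_transformers import CrossEncoder, SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def prefilter_tools(query: str, tools: list[dict],
                    top_k: int = 50, final_k: int = 20) -> list[dict]:
    """Narrow a large toolset to a small, relevant subset before Code Mode."""
    docs = [f"{t['name']}: {t['description']}" for t in tools]

    # 1. Cheap embedding-based retrieval: cosine similarity, keep top_k.
    doc_emb = embedder.encode(docs, convert_to_tensor=True)
    query_emb = embedder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(query_emb, doc_emb, top_k=top_k)[0]
    candidates = [(tools[h["corpus_id"]], docs[h["corpus_id"]]) for h in hits]

    # 2. Rerank the shortlist with a cross-encoder, keep final_k.
    scores = reranker.predict([(query, doc) for _, doc in candidates])
    ranked = sorted(zip(scores, candidates), key=lambda p: p[0], reverse=True)
    return [tool for _, (tool, _) in ranked[:final_k]]
```

With a registry of thousands of tools, a call like prefilter_tools("schedule a meeting across time zones", all_tools) would hand Code Mode a couple dozen candidates instead of the full set, and the tool-corpus embeddings can be computed once and cached.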
Alternatives Considered
- Compressing tool definitions with LLMs (suboptimal, loses semantics, requires manual versioning).
- Manual curation of tool subsets (tedious, unscalable, and unsuitable for MCP clients).
- Relying solely on Code Mode (stays inefficient with very large toolsets and remains prone to Context Rot).
Use Cases
- Reducing 1,000+ tools into a manageable subset for complex workflows.
- Narrowing large enumerations within tools before LLM reasoning.
- Automating tool selection in dynamic environments (e.g., MCP) without manual intervention.
Example Usage
```
# Command line example
mcp-use semantic-pre-filter --tools large_toolset.json --rerank --codemode
```
Implementation Details
- Integrate semantic search embeddings for pre-filtering.
- Use reranking algorithms to prioritize relevance.
- Treat the pipeline as a standard RAG workflow for scalability.
- Reference implementation idea: Stanford A1's ReduceAndGenerate() in its extra strategies.
- Improve the existing MCP-Use search_tools functionality to pre-filter before Code Mode, and also reduce oversized enum lists in tool parameters (sketched below).
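For the enum side, the same machinery can shrink an oversized options list before the tool schema is serialized for the LLM. A hedged sketch, reusing the embedder from the earlier example (prefilter_enum is a hypothetical helper, not an existing API):

```python
def prefilter_enum(query: str, enum_values: list[str], keep: int = 25) -> list[str]:
    """Keep only the enum options semantically closest to the current task."""
    value_emb = embedder.encode(enum_values, convert_to_tensor=True)
    query_emb = embedder.encode(query, convert_to_tensor=True)
    hits = util.semantic_search(query_emb, value_emb, top_k=keep)[0]
    return [enum_values[h["corpus_id"]] for h in hits]
```

A tool parameter with thousands of allowed values (e.g., every region or locale) would then expose only the few dozen options relevant to the task, which is where much of the enum-related token saving comes from.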
Suggested Models: to be determined; any embedding model for retrieval paired with a cross-encoder reranker fits the pipeline above.
Breaking Changes
- [ ] This feature would introduce breaking changes
- [x] This feature is backwards compatible
Additional Context
- Code Mode already reduces LLM token consumption.
- Semantic pre-filtering further improves efficiency and accuracy by shifting work to cheaper embedding tokens.
- Helps avoid Context Rot in multi-step tasks with large toolsets.
- Complements Code Mode rather than replacing it; suitable for MCP clients.