SAGE is an AI protocol that dynamically manages multi-LLM workflows by breaking down user prompts into validated, goal-driven sub-tasks processed by the most suitable language models.
- Decomposer Agent: Breaks main prompt into meaningful sub-prompts
- Router Agent: Selects the best model for each sub-task
- Execution Manager: Runs sub-prompts sequentially with context
- Evaluator: Uses an LLM-based approach to judge whether each sub-task's response fulfills the sub-task, by prompting a selected LLM (default: Deepseek) to act as an expert evaluator (see the sketch after this list). The evaluator model is configurable via the config.
- Retry/Reassign Handler: Manages failed tasks
- Aggregator: Combines outputs into final response
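To make the Evaluator's judging step concrete, here is a minimal LLM-as-judge sketch; the prompt wording, the ollama client call, and the YES/NO parsing are assumptions for illustration, not SAGE's actual evaluator code.

```python
# Minimal LLM-as-judge sketch (illustrative only; prompt wording, the ollama
# client call, and the yes/no parsing are assumptions, not SAGE internals).
import ollama

def llm_judge(sub_task: str, response: str, evaluator_model: str = "deepseek-r1:1.5b") -> bool:
    judge_prompt = (
        "You are an expert evaluator. Decide whether the RESPONSE fully satisfies "
        "the SUB-TASK. Answer with a single word: YES or NO.\n\n"
        f"SUB-TASK:\n{sub_task}\n\nRESPONSE:\n{response}"
    )
    reply = ollama.chat(
        model=evaluator_model,
        messages=[{"role": "user", "content": judge_prompt}],
    )
    verdict = reply["message"]["content"].strip().upper()
    return verdict.startswith("YES")
```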
flowchart TB
UserInput(["User Prompt Input"]) --> Decomposer["DecomposerAgent (Breaks down user prompt into sub-prompts)"]
Decomposer --> SubPromptList["List of Sub-Prompts"]
SubPromptList --> SubPromptLoop["Process Next Sub-Prompt"]
subgraph "For Each Sub-Prompt (Sequential Processing)"
SubPromptLoop --> Router["RouterAgent (Selects best model for sub-prompt)"]
Router --> RouterDecision{"Meta-router Success?"}
RouterDecision -->|Yes| Executor["ExecutionManager (Executes sub-prompt using assigned model)"]
RouterDecision -->|No| Fallback["Use fallback model from config"]
Fallback --> Executor
Executor --> Evaluator["Evaluator (Evaluates if result meets expected goal)"]
Evaluator --> EvalDecision{"Evaluation Successful?"}
EvalDecision -->|Yes| StoreResult["Store successful result"]
EvalDecision -->|No| RetryCheck{"Retry count < max_retries?"}
RetryCheck -->|Yes| SelectNewModel["Select new model (not previously tried)"]
SelectNewModel --> Executor
RetryCheck -->|No| StoreBestAttempt["Store best attempt (highest similarity score)"]
StoreResult --> MoreSubPrompts{"More sub-prompts?"}
StoreBestAttempt --> MoreSubPrompts
MoreSubPrompts -->|Yes| SubPromptLoop
end
MoreSubPrompts -->|No| Aggregator["Aggregator (Combines all sub-prompt results)"]
Aggregator --> FinalResponse(["Return Final Response and Metadata"])
- Clone or download this repository:
  git clone https://github.com/saim-x/SAGE
  cd SAGE
  Or download the ZIP from GitHub and extract it, then open a terminal in the extracted folder.
- Install all requirements:
  pip install -r requirements.txt
- Run the protocol:
  python src/test_sage_protocol.py
- gemma3:4b (Ollama local)
- deepseek-r1:1.5b (Ollama local)
- qwen3:1.7b (Ollama local)
- gemini-2.5-flash (Gemini, cloud)
Note: You can now use Gemini (cloud) as an LLM provider. See below for setup instructions.
To use Gemini as a cloud LLM provider, you must add your Gemini API key to a .env file in the project root:
GEMINI_API_KEY=your_gemini_api_key_here
The protocol will automatically detect and use Gemini for any sub-task assigned to a Gemini model (e.g., gemini-2.5-flash).
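As a quick sanity check of the key setup, the sketch below loads the key from .env and issues a single Gemini call. The choice of python-dotenv and the google-generativeai client here is an assumption for illustration, not necessarily the library SAGE uses internally.

```python
# Sketch only: reads GEMINI_API_KEY from .env and issues a test call.
# The google-generativeai client is an assumed choice, not confirmed SAGE internals.
import os
from dotenv import load_dotenv          # pip install python-dotenv
import google.generativeai as genai     # pip install google-generativeai

load_dotenv()  # reads .env from the project root / current working directory
genai.configure(api_key=os.environ["GEMINI_API_KEY"])

model = genai.GenerativeModel("gemini-2.5-flash")
print(model.generate_content("Say hello from SAGE").text)
```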
- Create a virtual environment:
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install development dependencies:
  pip install -r requirements.txt
- Run the protocol for testing or demonstration:
  python src/test_sage_protocol.py
  You can also provide a custom prompt:
  python src/test_sage_protocol.py --prompt "Your custom prompt here"
  Or show detailed output for each sub-prompt:
  python src/test_sage_protocol.py --verbose

Note: There are currently no traditional pytest-based tests in this project. All protocol testing and demonstration is performed via the CLI script above.
MIT License
SAGE is completely open source and we welcome contributions from the community!
How to contribute:
- Fork this repository to your own GitHub account.
- Clone your fork to your local machine.
- Create a new branch for your feature or bugfix:
git checkout -b my-feature
- Make your changes and commit them with clear messages.
- Push your branch to your fork:
git push origin my-feature
- Open a Pull Request on GitHub, describing your changes and why they should be merged.
Guidelines:
- Please open an issue if you want to discuss a bug or feature before submitting code.
- Try to follow the existing code style and structure.
- If possible, test your changes before submitting.
- Be respectful and constructive in all interactions.
We appreciate all kinds of contributions—bug reports, feature requests, documentation improvements, and code!
- gemma3:4b
- deepseek-r1:1.5b
- qwen3:1.7b
To use these models, ensure they are installed in your local Ollama instance:
ollama pull gemma3:4b
ollama pull deepseek-r1:1.5b
ollama pull qwen3:1.7b

Edit config/settings.yaml to set available models. Example:
available_models:
- "gemma3:4b"
- "deepseek-r1:1.5b"
- "qwen3:1.7b"

Project structure:
- src/ — Main source code for the protocol and CLI
- src/sage/ — Core protocol package
- src/sage/agents/ — All agent classes (decomposer, router, executor, evaluator, aggregator, base)
- src/sage/core/ — Core models and utilities
- src/test_sage_protocol.py — CLI runner and test entry point
- config/ — Configuration files (e.g., settings.yaml)
- requirements.txt — Python dependencies
- setup.py — Package setup
- SAGE.spec.yaml — Protocol specification (see below)
- sage_protocol.log — Log file for all runs (debugging/audit)
- README.md — This documentation
Run the protocol with the default test prompt:
python src/test_sage_protocol.py

Run with a custom prompt:
python src/test_sage_protocol.py --prompt "Your custom prompt here"

Show detailed output for each sub-prompt:
python src/test_sage_protocol.py --verbose

All protocol runs are logged to sage_protocol.log in the project root. This file contains detailed step-by-step logs for debugging and audit trails.
SAGE is designed to be modular and extensible. You can:
- Add new agent types (e.g., for planning, validation, or post-processing) in src/sage/agents/
- Integrate additional LLM providers or models
- Customize decomposition, routing, or evaluation logic
- Plug in custom similarity metrics or feedback mechanisms
The protocol is formally described in SAGE.spec.yaml. This file documents all components, workflow steps, configuration options, and extensibility points. Use it as a reference for implementation or extension.
If LLM-based evaluation is unavailable (e.g., Ollama is not running), SAGE automatically falls back to semantic similarity (cosine similarity of sentence embeddings) between the model's answer and the sub-prompt content to determine success or failure. This requires the sentence-transformers package; if it is not installed, SAGE falls back to plain string similarity (see the sketch below).
Dependency:
sentence-transformers>=2.2.2
You can tune the similarity threshold in your config (similarity_threshold).
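For reference, the snippet below is a minimal sketch of this fallback chain, assuming the all-MiniLM-L6-v2 embedding model and a simple threshold check. The function names and embedding model choice are illustrative, not the actual SAGE internals.

```python
# Hypothetical sketch of the evaluation fallback chain: sentence-embedding cosine
# similarity first, plain string similarity if sentence-transformers is missing.
from difflib import SequenceMatcher

def similarity_score(answer: str, subprompt: str) -> float:
    try:
        from sentence_transformers import SentenceTransformer, util
        model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
        emb = model.encode([answer, subprompt], convert_to_tensor=True)
        return float(util.cos_sim(emb[0], emb[1]))
    except ImportError:
        # Fallback: character-level string similarity in [0, 1]
        return SequenceMatcher(None, answer, subprompt).ratio()

def evaluate(answer: str, subprompt: str, similarity_threshold: float = 0.9) -> bool:
    return similarity_score(answer, subprompt) >= similarity_threshold
```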
SAGE supports interactive selection between local (Ollama) and cloud (Gemini) LLM providers at runtime. When running the CLI, you will be prompted to choose your provider type, and the protocol will dynamically filter available models and assignments based on your choice. This ensures only compatible models are used for your selected environment.
The CLI runner dynamically filters the configuration (models, assignments, parameters) based on your provider selection, ensuring that only relevant models and settings are used for each run.
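As an illustration of that filtering step (a sketch only; the helper name and the exact keys it touches are assumptions based on the configuration options documented below):

```python
# Illustrative sketch (not the actual CLI code) of filtering the loaded
# settings.yaml by the provider chosen at runtime.
import yaml

def filter_config_by_provider(config_path: str, provider: str) -> dict:
    """provider is 'local' (Ollama) or 'cloud' (Gemini)."""
    with open(config_path) as f:
        config = yaml.safe_load(f)

    provider_map = config.get("model_provider_map", {})
    allowed = {m for m in config.get("available_models", [])
               if provider_map.get(m) == provider}

    config["available_models"] = sorted(allowed)
    # Keep only task assignments that point at a compatible model.
    config["model_assignments"] = {
        task: model for task, model in config.get("model_assignments", {}).items()
        if model in allowed
    }
    return config

# Example: cfg = filter_config_by_provider("config/settings.yaml", "local")
```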
SAGE features a detailed logging system. All protocol runs are logged to sage_protocol.log in the project root. Each agent (decomposer, router, executor, evaluator, aggregator) logs its actions, warnings, and errors, providing a comprehensive audit trail for debugging and transparency.
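A minimal sketch of what per-agent logging to sage_protocol.log can look like (the logger names and format string are illustrative; the real behavior is driven by the logging section of settings.yaml):

```python
# Sketch: per-agent loggers writing to sage_protocol.log (names/format illustrative).
import logging

logging.basicConfig(
    filename="sage_protocol.log",
    level=logging.INFO,
    format="%(asctime)s [%(name)s] %(levelname)s: %(message)s",
)

router_log = logging.getLogger("sage.agents.router")
router_log.info("Assigned model %s to sub-prompt %d", "gemma3:4b", 1)
```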
When a sub-task fails and is retried with a new model, SAGE may automatically adjust model parameters (such as increasing temperature or max tokens) to improve the chances of success. This adaptive retry logic is handled by the router agent.
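For illustration, a retry-time adjustment might look like the sketch below; the specific increments are assumptions, not the router agent's actual values.

```python
# Illustrative retry-time parameter adjustment; the increments are assumptions.
def adjust_parameters_for_retry(params: dict, attempt: int) -> dict:
    adjusted = dict(params)
    # Nudge temperature up a little each retry to encourage a different answer.
    adjusted["temperature"] = min(1.0, params.get("temperature", 0.7) + 0.1 * attempt)
    # Allow longer outputs in case the previous answer was truncated.
    adjusted["max_tokens"] = int(params.get("max_tokens", 1024) * 1.5)
    return adjusted
```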
Each sub-task (SubPrompt) tracks dependencies on previous sub-tasks, and the output of one sub-task is passed as context to the next. This enables sequential reasoning and context-aware execution across the workflow.
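A simplified sketch of this sequential, context-aware execution is shown below; the field and function names are illustrative stand-ins for the real models in src/sage/core/.

```python
# Simplified sketch of dependency tracking and context passing between sub-tasks.
from dataclasses import dataclass, field

@dataclass
class SubPrompt:
    id: int
    content: str
    depends_on: list[int] = field(default_factory=list)  # ids of earlier sub-prompts
    result: str | None = None

def run_sequentially(sub_prompts: list[SubPrompt], execute) -> None:
    results: dict[int, str] = {}
    for sp in sub_prompts:
        # Outputs of dependencies become context for the current sub-task.
        context = "\n".join(results[d] for d in sp.depends_on if d in results)
        sp.result = execute(prompt=sp.content, context=context)
        results[sp.id] = sp.result
```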
SAGE is designed to be extensible for new LLM providers. While the current implementation supports Ollama (local) and Gemini (cloud), the codebase is structured to allow easy integration of additional providers in the future. (Note: Only Ollama and Gemini are currently implemented.)
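One way such an integration point could look is sketched below; the LLMProvider interface and class names are hypothetical and only mirror the local/cloud split described above, not SAGE's actual provider hooks.

```python
# Hypothetical provider interface showing where a new backend could plug in.
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    @abstractmethod
    def generate(self, model: str, prompt: str, **params) -> str: ...

class OllamaProvider(LLMProvider):
    def generate(self, model: str, prompt: str, **params) -> str:
        import ollama
        return ollama.generate(model=model, prompt=prompt)["response"]

class GeminiProvider(LLMProvider):
    def generate(self, model: str, prompt: str, **params) -> str:
        import google.generativeai as genai
        return genai.GenerativeModel(model).generate_content(prompt).text
```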
Edit config/settings.yaml to set available models. Example:
available_models:
- "gemma3:4b"
- "deepseek-r1:1.5b"
- "qwen3:1.7b"

The following configuration options are supported in config/settings.yaml:
- similarity_threshold: (float) Similarity threshold for evaluation (default: 0.9)
- max_retries: (int) Maximum number of retries for failed sub-tasks (default: 3)
- default_model: (string) Default model to use if routing fails
- available_models: (list) List of available model names
- model_assignments: (dict) Mapping of task types to model names
- model_parameters: (dict) Per-model parameter settings
- evaluator_model: (string) Model used for LLM-based evaluation
- model_provider_map: (dict) Mapping of model names to provider types (local/cloud)
- logging: (dict) Logging configuration (level, format)
Note: The retry_strategy options for backoff and delay are not currently implemented and have been removed from the configuration. Only max_retries is used for retry logic.
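Putting the options together, an illustrative settings.yaml might look like this (values are placeholders except where defaults are stated above; the task-type names under model_assignments are examples):

```yaml
# Illustrative settings.yaml combining the documented options; values are
# placeholders, not the repository defaults (except similarity_threshold/max_retries).
similarity_threshold: 0.9
max_retries: 3
default_model: "gemma3:4b"
evaluator_model: "deepseek-r1:1.5b"

available_models:
  - "gemma3:4b"
  - "deepseek-r1:1.5b"
  - "qwen3:1.7b"
  - "gemini-2.5-flash"

model_assignments:
  summarization: "gemma3:4b"       # task-type names here are examples
  reasoning: "deepseek-r1:1.5b"

model_parameters:
  "gemma3:4b":
    temperature: 0.7
    max_tokens: 1024

model_provider_map:
  "gemma3:4b": local
  "deepseek-r1:1.5b": local
  "qwen3:1.7b": local
  "gemini-2.5-flash": cloud

logging:
  level: INFO
  format: "%(asctime)s %(name)s %(levelname)s %(message)s"
```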
SAGE is designed to be modular and extensible. You can:
- Add new agent types (e.g., for planning, validation, or post-processing) in src/sage/agents/
- Integrate additional LLM providers or models (currently, only Ollama and Gemini are implemented)
- Customize decomposition, routing, or evaluation logic
- Plug in custom similarity metrics or feedback mechanisms
If you want to use SAGE as a Python library in your own scripts, you can do so after installing requirements:
from sage import SAGE
sage = SAGE()
result = sage.process_prompt("Your prompt here")

Note: This is for advanced users. For most use cases, the CLI runner is recommended.
