A Python codebase that adds a thinking mode and tools on top of standard Llama models. It tracks context usage, supports exo, and can read, edit, or delete files based on its reasoning and your instructions.
👤 You: What files are in this directory?
🤖 AI is thinking...
🧠 THINKING PROCESS:
The user wants to know what files are in the current directory. I should use the list_dir tool to explore the workspace.
💡 FINAL RESPONSE:
Let me check what files are available in the current directory.
TOOL:list_dir(relative_workspace_path=".", explanation="Exploring the workspace to see available files")
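One way a response in this shape could be split into its thinking and answer parts is sketched below. This is purely illustrative; the actual handling in llm_thinking_mode.py may differ.

```python
# Illustrative sketch: splitting a response into its thinking and answer
# parts using the section headers shown above. The actual handling in
# llm_thinking_mode.py may differ.
def split_thinking(response_text):
    marker = "💡 FINAL RESPONSE:"
    if marker in response_text:
        thinking, final = response_text.split(marker, 1)
        return thinking.replace("🧠 THINKING PROCESS:", "").strip(), final.strip()
    return "", response_text.strip()

thinking, final = split_thinking(
    "🧠 THINKING PROCESS:\nThe user wants a file listing.\n💡 FINAL RESPONSE:\nLet me check the directory."
)
print(final)  # Let me check the directory.
```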
Edit llm_thinking_mode.py to configure your setup:
# API Configuration
API_URL = "http://127.0.0.1:52415/v1/chat/completions"
MODEL_NAME = "llama-3.2-3b" # Your model name
# Context Limits (tokens)
CONTEXT_LIMITS = {
"llama-3.2-1b": 8192, # 8K context window
"llama-3.2-3b": 32768, # 32K context window
"llama-3.2-8b": 131072, # 128K context window
}
# Thinking Mode Settings
thinking_config = {
"temperature": 0.7, # Creativity level
"top_p": 0.9, # Vocabulary breadth
"max_tokens": 400, # Response length
"presence_penalty": 0.3, # Exploration encouragement
"frequency_penalty": 0.2, # Repetition avoidance
}
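For reference, here is a minimal sketch of how these values could be combined into an OpenAI-compatible request body. The system prompt shown is only a placeholder; the exact wiring in llm_thinking_mode.py may differ.

```python
# Minimal sketch: combining the settings above into an OpenAI-compatible
# chat completions payload; the exact wiring in llm_thinking_mode.py may differ.
payload = {
    "model": MODEL_NAME,
    "messages": [
        {"role": "system", "content": "Think step by step before answering."},  # placeholder prompt
        {"role": "user", "content": "What files are in this directory?"},
    ],
    **thinking_config,  # temperature, top_p, max_tokens, penalties
}
```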
- Clone the repository:
  git clone <repository-url>
  cd <filename>
- Create a virtual environment:
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Set up your local LLM API:
  - Make sure you have exo or any other compatible local or online LLM API running on http://127.0.0.1:52415
  - Update the `MODEL_NAME` in `llm_thinking_mode.py` if needed
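Once exo (or another compatible server) is running, a quick smoke test like the one below confirms the endpoint is reachable. This assumes the `requests` package is installed; adjust the URL and model name to your setup.

```python
# Quick smoke test: send a one-line prompt to the local endpoint to confirm
# it is reachable. Assumes the `requests` package; adjust URL/model as needed.
import requests

resp = requests.post(
    "http://127.0.0.1:52415/v1/chat/completions",
    json={
        "model": "llama-3.2-3b",
        "messages": [{"role": "user", "content": "Say hello."}],
        "max_tokens": 16,
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```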
- Transparent Reasoning: The AI shows its step-by-step thought process before giving answers
- Structured Responses: Clear separation between thinking process and final answer
- Interactive Chat: Real-time conversation with thinking visualization
- File Operations: Read files with line range support
- Directory Exploration: List directory contents with file sizes
- Tool Parsing: Automatic extraction and execution of tool calls from AI responses
- Extensible Architecture: Easy to add new tools
- Token Tracking: Real-time monitoring of context usage
- Smart Warnings: Alerts when approaching context limits
- Auto-Management: Suggestions for managing conversation history
- Model-Specific Limits: Support for different model context windows
- Interactive Chat: Real-time conversation mode
- Batch Processing: Process multiple questions in sequence
- Context Management: Clear history, show context stats
python llm_thinking_mode.py

The assistant will start in interactive mode, where you can:
- Ask questions and see the AI's thinking process
- Use commands like `context`, `clear`, `help`, and `quit`
- Watch real-time context usage monitoring

Available commands:
- `quit` - Exit the chat
- `context` - Show detailed context information
- `clear` - Clear conversation history
- `help` - Show available commands
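For reference, the command handling inside the interactive loop is roughly of this shape. This is a self-contained, illustrative sketch; the real loop in llm_thinking_mode.py also displays the thinking process and tracks context usage, and `answer_with_thinking` here is a hypothetical stand-in.

```python
# Illustrative sketch of the interactive command loop; the real loop in
# llm_thinking_mode.py also displays thinking and tracks context usage.
conversation_history = []

def answer_with_thinking(question):
    # Hypothetical stand-in for the call that queries the model and
    # prints its thinking process and final response.
    print(f"(would send to the model) {question}")

while True:
    user_input = input("👤 You: ").strip()
    if user_input == "quit":
        break
    if user_input == "context":
        print(f"Messages in history: {len(conversation_history)}")
    elif user_input == "clear":
        conversation_history.clear()
        print("History cleared.")
    elif user_input == "help":
        print("Commands: quit, context, clear, help")
    elif user_input:
        conversation_history.append({"role": "user", "content": user_input})
        answer_with_thinking(user_input)
```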
You can also process multiple questions at once by modifying the script:
# Uncomment and modify the batch section in llm_thinking_mode.py
sample_questions = [
"What is consciousness?",
"How should AI be regulated?",
"What's the meaning of existence?"
]
run_batch_thinking(sample_questions)
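A batch runner of this kind can be as simple as the sketch below. The `ask` parameter is a hypothetical stand-in for the script's single-question helper; the actual `run_batch_thinking` in llm_thinking_mode.py may differ.

```python
# Minimal sketch of a batch runner; the actual run_batch_thinking in
# llm_thinking_mode.py may differ. `ask` is a hypothetical stand-in for
# the script's single-question helper.
def run_batch_thinking(questions, ask=print):
    for i, question in enumerate(questions, start=1):
        print(f"\n=== Question {i}/{len(questions)} ===")
        ask(question)

run_batch_thinking(["What is consciousness?", "How should AI be regulated?"])
```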
Create a `context.md` file to provide additional context to the AI (it currently runs 3.5 Sonnet system instructions; update as needed). The context needed to run the tools is already included in `llm_thinking_mode.py`:
# Project Context
This is a development environment for...
[Add any relevant context about your project]

Project structure:

llmloops/
├── llm_thinking_mode.py # Main interactive chat script
├── tools.py # Tool implementation (read_file, list_dir)
├── test_parsing.py # Tool parsing tests
├── test_tools.py # Tool execution tests
├── context.md # Optional context file
├── requirements.txt # Python dependencies
├── README.md # This file
Read file contents with line range support:
TOOL:read_file(target_file="example.py", should_read_entire_file=false, start_line_one_indexed=1, end_line_one_indexed_inclusive=20, explanation="Reading the first 20 lines")
List directory contents with file sizes:
TOOL:list_dir(relative_workspace_path=".", explanation="Exploring the current directory")
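Under the hood, these calls are plain text in the model's output, so they can be pulled out with a simple pattern. The snippet below is only an illustration of the idea; the actual parser lives in llm_thinking_mode.py and tools.py and is exercised by test_parsing.py.

```python
import re

# Illustration of extracting TOOL: calls from a model response; the actual
# parser lives in llm_thinking_mode.py / tools.py. This simple pattern does
# not handle ")" inside argument values.
TOOL_PATTERN = re.compile(r"TOOL:(\w+)\((.*?)\)", re.DOTALL)

def extract_tool_calls(response_text):
    """Return (tool_name, raw_argument_string) pairs found in the text."""
    return TOOL_PATTERN.findall(response_text)

response = 'TOOL:list_dir(relative_workspace_path=".", explanation="Exploring the current directory")'
print(extract_tool_calls(response))
# [('list_dir', 'relative_workspace_path=".", explanation="Exploring the current directory"')]
```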
Run the test scripts to verify functionality:
# Test tool parsing
python test_parsing.py
# Test tool execution
python test_tools.py

The system provides real-time context monitoring:
- 🟢 OK: Normal usage, plenty of context remaining
- 🟡 WARNING: Approaching context limit (80%+ used)
- 🔴 CRITICAL: Very close to context limit (90%+ used)
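As a rough illustration, the indicator can be derived from the model's context window as shown below. The thresholds follow the 80%/90% levels above; the exact token counting in llm_thinking_mode.py may differ.

```python
# Rough illustration of mapping context usage to a status indicator using
# the 80% / 90% thresholds above; the exact token counting in
# llm_thinking_mode.py may differ.
CONTEXT_LIMITS = {"llama-3.2-1b": 8192, "llama-3.2-3b": 32768, "llama-3.2-8b": 131072}

def context_status(used_tokens, model="llama-3.2-3b"):
    usage = used_tokens / CONTEXT_LIMITS[model]
    if usage >= 0.90:
        return "🔴 CRITICAL"
    if usage >= 0.80:
        return "🟡 WARNING"
    return "🟢 OK"

print(context_status(27000))  # 🟡 WARNING (about 82% of a 32K window)
```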
The system is designed to work with OpenAI-compatible APIs. Make sure your local LLM API supports:
- The `/v1/chat/completions` endpoint
- The standard message format with `role` and `content`
- Streaming or non-streaming responses
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request
This project is open source. Please check the license file for details.
- Tool Parsing Issues: Check that tool calls follow the exact format shown in the examples, and review `context.md`