A Python codebase that adds a thinking mode and tools on top of standard Llama models. It tracks context usage, supports exo, and can read, edit, or delete files based on its reasoning and your instructions.
👤 You: What files are in this directory?
🤖 AI is thinking...
🧠 THINKING PROCESS:
The user wants to know what files are in the current directory. I should use the list_dir tool to explore the workspace.
💡 FINAL RESPONSE:
Let me check what files are available in the current directory.
TOOL:list_dir(relative_workspace_path=".", explanation="Exploring the workspace to see available files")
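One way a response in this shape could be split into its thinking and answer parts is sketched below. This is purely illustrative; the actual handling in llm_thinking_mode.py may differ.

```python
# Illustrative sketch: splitting a response into its thinking and answer
# parts using the section headers shown above. The actual handling in
# llm_thinking_mode.py may differ.
def split_thinking(response_text):
    marker = "💡 FINAL RESPONSE:"
    if marker in response_text:
        thinking, final = response_text.split(marker, 1)
        return thinking.replace("🧠 THINKING PROCESS:", "").strip(), final.strip()
    return "", response_text.strip()

thinking, final = split_thinking(
    "🧠 THINKING PROCESS:\nThe user wants a file listing.\n💡 FINAL RESPONSE:\nLet me check the directory."
)
print(final)  # Let me check the directory.
```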
Edit llm_thinking_mode.py to configure your setup:
# API Configuration
API_URL = "http://127.0.0.1:52415/v1/chat/completions"
MODEL_NAME = "llama-3.2-3b" # Your model name
# Context Limits (tokens)
CONTEXT_LIMITS = {
"llama-3.2-1b": 8192, # 8K context window
"llama-3.2-3b": 32768, # 32K context window
"llama-3.2-8b": 131072, # 128K context window
}
# Thinking Mode Settings
thinking_config = {
"temperature": 0.7, # Creativity level
"top_p": 0.9, # Vocabulary breadth
"max_tokens": 400, # Response length
"presence_penalty": 0.3, # Exploration encouragement
"frequency_penalty": 0.2, # Repetition avoidance
}
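For reference, here is a minimal sketch of how these values could be combined into an OpenAI-compatible request body. The system prompt shown is only a placeholder; the exact wiring in llm_thinking_mode.py may differ.

```python
# Minimal sketch: combining the settings above into an OpenAI-compatible
# chat completions payload; the exact wiring in llm_thinking_mode.py may differ.
payload = {
    "model": MODEL_NAME,
    "messages": [
        {"role": "system", "content": "Think step by step before answering."},  # placeholder prompt
        {"role": "user", "content": "What files are in this directory?"},
    ],
    **thinking_config,  # temperature, top_p, max_tokens, penalties
}
```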
- Clone the repository:
  git clone <repository-url>
  cd <filename>
- Create a virtual environment:
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Set up your local LLM API:
  - Make sure you have exo or any other compatible local or online LLM API running on http://127.0.0.1:52415
  - Update the `MODEL_NAME` in `llm_thinking_mode.py` if needed
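Once exo (or another compatible server) is running, a quick smoke test like the one below confirms the endpoint is reachable. This assumes the `requests` package is installed; adjust the URL and model name to your setup.

```python
# Quick smoke test: send a one-line prompt to the local endpoint to confirm
# it is reachable. Assumes the `requests` package; adjust URL/model as needed.
import requests

resp = requests.post(
    "http://127.0.0.1:52415/v1/chat/completions",
    json={
        "model": "llama-3.2-3b",
        "messages": [{"role": "user", "content": "Say hello."}],
        "max_tokens": 16,
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```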
- Transparent Reasoning: The AI shows its step-by-step thought process before giving answers
- Structured Responses: Clear separation between thinking process and final answer
- Interactive Chat: Real-time conversation with thinking visualization
- File Operations: Read files with line range support
- Directory Exploration: List directory contents with file sizes
- Tool Parsing: Automatic extraction and execution of tool calls from AI responses
- Extensible Architecture: Easy to add new tools
- Token Tracking: Real-time monitoring of context usage
- Smart Warnings: Alerts when approaching context limits
- Auto-Management: Suggestions for managing conversation history
- Model-Specific Limits: Support for different model context windows
- Interactive Chat: Real-time conversation mode
- Batch Processing: Process multiple questions in sequence
- Context Management: Clear history, show context stats
python llm_thinking_mode.py

The assistant will start in interactive mode, where you can:
- Ask questions and see the AI's thinking process
- Use commands like `context`, `clear`, `help`, and `quit`
- Watch real-time context usage monitoring

Available commands:
- `quit` - Exit the chat
- `context` - Show detailed context information
- `clear` - Clear conversation history
- `help` - Show available commands
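For reference, the command handling inside the interactive loop is roughly of this shape. This is a self-contained, illustrative sketch; the real loop in llm_thinking_mode.py also displays the thinking process and tracks context usage, and `answer_with_thinking` here is a hypothetical stand-in.

```python
# Illustrative sketch of the interactive command loop; the real loop in
# llm_thinking_mode.py also displays thinking and tracks context usage.
conversation_history = []

def answer_with_thinking(question):
    # Hypothetical stand-in for the call that queries the model and
    # prints its thinking process and final response.
    print(f"(would send to the model) {question}")

while True:
    user_input = input("👤 You: ").strip()
    if user_input == "quit":
        break
    if user_input == "context":
        print(f"Messages in history: {len(conversation_history)}")
    elif user_input == "clear":
        conversation_history.clear()
        print("History cleared.")
    elif user_input == "help":
        print("Commands: quit, context, clear, help")
    elif user_input:
        conversation_history.append({"role": "user", "content": user_input})
        answer_with_thinking(user_input)
```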
You can also process multiple questions at once by modifying the script:
# Uncomment and modify the batch section in llm_thinking_mode.py
sample_questions = [
"What is consciousness?",
"How should AI be regulated?",
"What's the meaning of existence?"
]
run_batch_thinking(sample_questions)
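A batch runner of this kind can be as simple as the sketch below. The `ask` parameter is a hypothetical stand-in for the script's single-question helper; the actual `run_batch_thinking` in llm_thinking_mode.py may differ.

```python
# Minimal sketch of a batch runner; the actual run_batch_thinking in
# llm_thinking_mode.py may differ. `ask` is a hypothetical stand-in for
# the script's single-question helper.
def run_batch_thinking(questions, ask=print):
    for i, question in enumerate(questions, start=1):
        print(f"\n=== Question {i}/{len(questions)} ===")
        ask(question)

run_batch_thinking(["What is consciousness?", "How should AI be regulated?"])
```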
Create a `context.md` file to provide additional context to the AI (it currently runs 3.5 Sonnet system instructions; update as needed). The context needed to run the tools is already included in `llm_thinking_mode.py`:
# Project Context
This is a development environment for...
[Add any relevant context about your project]

Project structure:

llmloops/
├── llm_thinking_mode.py # Main interactive chat script
├── tools.py # Tool implementation (read_file, list_dir)
├── test_parsing.py # Tool parsing tests
├── test_tools.py # Tool execution tests
├── context.md # Optional context file
├── requirements.txt # Python dependencies
├── README.md # This file
Read file contents with line range support:
TOOL:read_file(target_file="example.py", should_read_entire_file=false, start_line_one_indexed=1, end_line_one_indexed_inclusive=20, explanation="Reading the first 20 lines")
List directory contents with file sizes:
TOOL:list_dir(relative_workspace_path=".", explanation="Exploring the current directory")
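Under the hood, these calls are plain text in the model's output, so they can be pulled out with a simple pattern. The snippet below is only an illustration of the idea; the actual parser lives in llm_thinking_mode.py and tools.py and is exercised by test_parsing.py.

```python
import re

# Illustration of extracting TOOL: calls from a model response; the actual
# parser lives in llm_thinking_mode.py / tools.py. This simple pattern does
# not handle ")" inside argument values.
TOOL_PATTERN = re.compile(r"TOOL:(\w+)\((.*?)\)", re.DOTALL)

def extract_tool_calls(response_text):
    """Return (tool_name, raw_argument_string) pairs found in the text."""
    return TOOL_PATTERN.findall(response_text)

response = 'TOOL:list_dir(relative_workspace_path=".", explanation="Exploring the current directory")'
print(extract_tool_calls(response))
# [('list_dir', 'relative_workspace_path=".", explanation="Exploring the current directory"')]
```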
Run the test scripts to verify functionality:
# Test tool parsing
python test_parsing.py
# Test tool execution
python test_tools.py

The system provides real-time context monitoring:
- 🟢 OK: Normal usage, plenty of context remaining
- 🟡 WARNING: Approaching context limit (80%+ used)
- 🔴 CRITICAL: Very close to context limit (90%+ used)
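As a rough illustration, the indicator can be derived from the model's context window as shown below. The thresholds follow the 80%/90% levels above; the exact token counting in llm_thinking_mode.py may differ.

```python
# Rough illustration of mapping context usage to a status indicator using
# the 80% / 90% thresholds above; the exact token counting in
# llm_thinking_mode.py may differ.
CONTEXT_LIMITS = {"llama-3.2-1b": 8192, "llama-3.2-3b": 32768, "llama-3.2-8b": 131072}

def context_status(used_tokens, model="llama-3.2-3b"):
    usage = used_tokens / CONTEXT_LIMITS[model]
    if usage >= 0.90:
        return "🔴 CRITICAL"
    if usage >= 0.80:
        return "🟡 WARNING"
    return "🟢 OK"

print(context_status(27000))  # 🟡 WARNING (about 82% of a 32K window)
```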
The system is designed to work with OpenAI-compatible APIs. Make sure your local LLM API supports:
- The `/v1/chat/completions` endpoint
- The standard message format with `role` and `content`
- Streaming or non-streaming responses
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request
This project is open source. Please check the license file for details.
- Tool Parsing Issues: Check that tool calls follow the exact format shown in the examples, and review `context.md`