Migration Summary: Anthropic → Ollama
Migration Goal: Transition from cloud-based AI to local, privacy-focused AI using efficient small models
Migration Scope: Complete migration of AI agent codebase from cloud-based Anthropic API to local Ollama implementation
Target Model: qwen3:4b - A compact, efficient model (4B parameters) designed for local execution
Key Benefits:
- Local Execution: No data leaves your machine - complete privacy
- Small Model Efficiency: Fast inference with minimal resource requirements
- Cost-Free Operation: No API fees or usage limits
- Offline Capability: Works without internet connection
- Full Control: Host locally or on your private servers
Migration Date: October 5, 2025
Server Implementation: For detailed documentation on the `--server` argument feature (enabling remote Ollama servers), see SERVER_IMPLEMENTATION.md
- Overview
- Reference Documentation
- Files Modified
- Types of Changes Applied
- Migration Benefits
- Prerequisites After Migration
- Testing & Verification
- Migration Status
- Usage After Migration
This document provides a comprehensive overview of the migration from Anthropic's Claude API to Ollama's local API using the qwen3:4b model.
Migration Scope: Complete migration of AI agent codebase from cloud-based Anthropic API to local Ollama implementation
Target Model: qwen3:4b (consistently used across all files)
Migration Date: October 5, 2025
This migration was based on the official documentation for both APIs:
Anthropic (Claude API):
- Tool Use Documentation: https://docs.claude.com/en/docs/agents-and-tools/tool-use/overview
- API Format: Anthropic's proprietary tool calling convention with `input_schema` format
- Response Structure: Complex content blocks with `tool_use` and `tool_result` types

Ollama:
- Tool Support Documentation: https://ollama.com/blog/tool-support
- API Format: OpenAI-compatible tool calling convention with `parameters` format
- Response Structure: Simplified message structure with `tool_calls` array
The migration involved converting between these two different tool calling conventions while maintaining identical functionality.
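The conversion between the two conventions can be sketched as a small pure function. This is only an illustration, not the project's code: the helper name `to_ollama_tool` and the sample `read_file` schema are assumptions made for the example. Only the key names (`input_schema`, `parameters`, and the `"type": "function"` wrapper) come from the two formats described above.

```python
from typing import Any


def to_ollama_tool(anthropic_tool: dict[str, Any]) -> dict[str, Any]:
    """Wrap an Anthropic-style tool definition in the OpenAI-compatible
    shape used by Ollama. (Hypothetical helper, for illustration only.)"""
    return {
        "type": "function",
        "function": {
            "name": anthropic_tool["name"],
            "description": anthropic_tool.get("description", ""),
            # The JSON Schema itself is unchanged; only the key is renamed.
            "parameters": anthropic_tool["input_schema"],
        },
    }


# Sample definition (assumed for this example).
read_file_tool = {
    "name": "read_file",
    "description": "Read the contents of a file",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

ollama_tool = to_ollama_tool(read_file_tool)
print(ollama_tool["function"]["name"])  # read_file
```

Keeping the schema object untouched and renaming only the surrounding keys is what lets the same tool definitions serve both APIs.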
| File | Type of Changes | Status |
|---|---|---|
| `main.py` | Full migration - Dependencies, API calls, response handling, CLI args | Complete |
| `README.MD` | Documentation updates - Installation instructions, usage examples | Complete |
| `test_ollama_migration.py` | Created - Migration verification script | New File |

| File | Type of Changes | Status |
|---|---|---|
| `runbook/01_basic_script.py` | Minimal - Comment updates only | Complete |
| `runbook/02_agent_class.py` | Basic migration - Dependencies, initialization | Complete |
| `runbook/03_define_tools.py` | Basic migration - Dependencies, initialization, tool setup | Complete |
| `runbook/04_implement_tool_execution.py` | Medium migration - Dependencies, initialization, tool execution | Complete |
| `runbook/05_add_chat_method.py` | Full migration - Dependencies, chat method, API calls, response handling | Complete |
| `runbook/06_create_interactive_cli.py` | Full migration - Dependencies, chat method, CLI args, logging | Complete |
| `runbook/07_add_personality.py` | Full migration - Dependencies, chat method, CLI args, system prompts | Complete |

| File | Type of Changes | Status |
|---|---|---|
| `tests/test_ollama_migration.py` | Created - Basic migration functionality test | New File |
| `tests/verify_runbook_migration.py` | Created - Automated verification script | New File |
| `docs/MIGRATION_SUMMARY.md` | Created - This comprehensive migration documentation | New File |

Files Affected: All Python files with dependencies
Before:
# /// script
# dependencies = [
#     "anthropic",
#     "pydantic",
# ]
# ///
After:
# /// script
# dependencies = [
#     "ollama",
#     "pydantic",
# ]
# ///

Files Affected: main.py, runbook/02-07_*.py
Before:
from anthropic import Anthropic
After:
import ollama

Files Affected: All files with AIAgent class
Before:
def __init__(self, api_key: str):
    self.client = Anthropic(api_key=api_key)
After:
def __init__(self, model: str = "qwen3:4b"):
    self.model = model

Files Affected: Files with chat functionality (main.py, runbook/05-07_*.py)
This was the most significant change, converting between two different tool calling conventions:
Before (Anthropic format - docs):
tool_schemas = [
    {
        "name": tool.name,
        "description": tool.description,
        "input_schema": tool.input_schema,  # Anthropic uses 'input_schema'
    }
    for tool in self.tools
]
After (Ollama/OpenAI format - docs):
ollama_tools = [
    {
        "type": "function",  # OpenAI standard requires 'type': 'function'
        "function": {
            "name": tool.name,
            "description": tool.description,
            "parameters": tool.input_schema,  # OpenAI uses 'parameters'
        },
    }
    for tool in self.tools
]

Files Affected: Files with chat functionality
Before (Anthropic API):
response = self.client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=4096,
    system="You are a helpful assistant...",
    messages=self.messages,
    tools=tool_schemas,
)
After (Ollama API):
messages_with_system = [
    {
        "role": "system",
        "content": "You are a helpful assistant..."
    }
] + self.messages
response = ollama.chat(
    model=self.model,
    messages=messages_with_system,
    tools=ollama_tools,
)

Files Affected: Files with chat functionality
Before (Complex content blocks):
assistant_message = {"role": "assistant", "content": []}
for content in response.content:
    if content.type == "text":
        assistant_message["content"].append({
            "type": "text",
            "text": content.text
        })
    elif content.type == "tool_use":
        assistant_message["content"].append({
            "type": "tool_use",
            "id": content.id,
            "name": content.name,
            "input": content.input,
        })
After (Simple message structure):
message = response.get("message", {})
self.messages.append({
    "role": "assistant",
    "content": message.get("content", ""),
    "tool_calls": message.get("tool_calls", [])
})

Files Affected: Files with tool execution
Before (Anthropic format):
tool_results.append({
    "type": "tool_result",
    "tool_use_id": content.id,
    "content": result,
})
After (Ollama format):
tool_results.append({
    "role": "tool",
    "content": result,
    "tool_call_id": tool_call.get("id", "")
})

Files Affected: main.py, runbook/06-07_*.py
Before:
parser.add_argument(
    "--api-key",
    help="Anthropic API key (or set ANTHROPIC_API_KEY env var)"
)
api_key = args.api_key or os.environ.get("ANTHROPIC_API_KEY")
if not api_key:
    print("Error: Please provide an API key")
    sys.exit(1)
agent = AIAgent(api_key)
After:
parser.add_argument(
    "--model",
    default="qwen3:4b",
    help="Ollama model to use (default: qwen3:4b)"
)
agent = AIAgent(args.model)

Files Affected: README.MD, all runbook files (comments)
Before:
# export ANTHROPIC_API_KEY="your-api-key-here"
# uv run main.py
After:
# ollama serve          # Make sure Ollama is running
# ollama pull qwen3:4b  # Pull the model if not already available
# uv run main.py

This migration embraces the local-first AI movement - running compact, efficient models on your own hardware instead of relying on cloud services.
| Aspect | Before (Anthropic Cloud) | After (Ollama Local) |
|---|---|---|
| Privacy | Data sent to external service | 100% local - data never leaves your machine |
| Cost | Pay-per-use API charges | Completely free after setup |
| Internet | Required for every API call | Works completely offline |
| Performance | Network latency + processing | Local processing speed only |
| Rate Limits | API throttling and quotas | No limits - use as much as you want |
| Model Control | Limited to Anthropic models | Any Ollama model (qwen3:4b, llama, etc.) |
| Resource Usage | External cloud compute | Efficient small models on modest hardware |
| Availability | Dependent on API uptime | Always available - you control it |
- Small Size: Only 4 billion parameters - runs on laptops and modest servers
- High Efficiency: Excellent performance-to-size ratio
- Fast Inference: Quick responses without heavy hardware requirements
- Tool Support: Full function calling capabilities for agent workflows
- Local Privacy: Perfect for sensitive code and data processing
- Install Ollama:
  # Linux/macOS
  curl -fsSL https://ollama.com/install.sh | sh
  # Windows: Download from ollama.com
- Start Ollama Service:
  ollama serve
- Pull Required Model:
  ollama pull qwen3:4b
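
The prerequisites above can also be checked from Python before starting the agent. The sketch below is a hedged example, not part of the migrated codebase: it assumes Ollama is listening on its default address `http://localhost:11434` and uses the server's `/api/tags` endpoint, which lists locally available models. The function names are hypothetical.

```python
import json
import urllib.request


def installed_models(tags_payload: dict) -> set[str]:
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return {m["name"] for m in tags_payload.get("models", []) if "name" in m}


def fetch_tags(base_url: str = "http://localhost:11434", timeout: float = 3.0) -> dict:
    """Ask a running Ollama server which models it has locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def check_prerequisites(model: str = "qwen3:4b") -> str:
    """Return a status line; connection failures are reported, not raised."""
    try:
        models = installed_models(fetch_tags())
    except OSError as exc:  # urllib's URLError and timeouts subclass OSError
        return f"Ollama is not reachable ({exc}); try `ollama serve`"
    if model not in models:
        return f"Model {model} not found; try `ollama pull {model}`"
    return f"Ollama is running and {model} is available"
```

Calling `print(check_prerequisites())` at startup gives a quick hint about which setup step is still missing.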
# Run verification script
python tests/verify_runbook_migration.py
# Test basic functionality
uv run tests/test_ollama_migration.py
# Test main application
uv run main.py
# Test individual runbook files
uv run runbook/05_add_chat_method.py
uv run runbook/07_add_personality.py --model qwen3:4b

- Total Files Modified: 11 files
- New Files Created: 3 files
- Migration Status: Complete
- Server Argument: Implemented with ollama.Client (see SERVER_IMPLEMENTATION.md for details)
- Verification Status: All tests passing
- Functionality Status: Full feature parity maintained
- Tool Execution Logic: Remains identical (read_file, list_files, edit_file)
- File Operations: Work exactly the same
- Conversation Flow: Context management preserved
- Error Handling: Patterns maintained
- Logging: Enhanced with tool execution logging
- Interactive Experience: Fully preserved
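
Because the tool implementations are unchanged, only the dispatch layer has to understand Ollama's `tool_calls` shape. The following is a minimal sketch under stated assumptions: the assistant message is a plain dict whose `tool_calls` entries look like `{"function": {"name": ..., "arguments": {...}}}`, and the `TOOLS` registry and the stub `list_files` are hypothetical stand-ins for the real tools.

```python
from typing import Any, Callable


def list_files(path: str = ".") -> str:
    """Stub tool used only for this example."""
    return f"(listing of {path})"


# Hypothetical registry mapping tool names to plain Python callables.
TOOLS: dict[str, Callable[..., str]] = {"list_files": list_files}


def run_tool_calls(message: dict[str, Any]) -> list[dict[str, Any]]:
    """Execute each requested tool and build 'tool' role messages."""
    results = []
    for tool_call in message.get("tool_calls") or []:
        function = tool_call.get("function", {})
        name = function.get("name", "")
        arguments = function.get("arguments") or {}
        tool = TOOLS.get(name)
        if tool is None:
            content = f"Error: unknown tool '{name}'"
        else:
            try:
                content = tool(**arguments)
            except Exception as exc:  # report failures back to the model
                content = f"Error: {exc}"
        results.append({
            "role": "tool",
            "content": content,
            "tool_call_id": tool_call.get("id", ""),
        })
    return results


assistant_message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [{"function": {"name": "list_files", "arguments": {"path": "src"}}}],
}
tool_messages = run_tool_calls(assistant_message)
print(tool_messages[0]["content"])  # (listing of src)
```

Returning errors as tool message content, rather than raising, keeps the conversation going so the model can retry with different arguments.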
# Start Ollama (one-time setup)
ollama serve
# Pull model (one-time setup)
ollama pull qwen3:4b
# Use main application
uv run main.py # Uses qwen3:4b by default
uv run main.py --model qwen3:4b # Explicit model selection
uv run main.py --server http://remote-host:11434 # Connect to remote Ollama server
# Use runbook examples
uv run runbook/07_add_personality.py # Full interactive experience
uv run runbook/06_create_interactive_cli.py --server http://remote:11434 # With remote server
uv run runbook/05_add_chat_method.py # Test chat functionality

This migration represents more than just a technical change - it's a shift toward democratized, privacy-first AI:
- Data Sovereignty: Your code, your conversations, your intellectual property stays on your infrastructure
- Independence: No vendor lock-in, no API dependencies, no service outages affecting your workflow
- Accessibility: AI capabilities available to anyone with modest hardware, not just those who can afford API costs
- Efficiency Revolution: Modern small models like qwen3:4b can handle many everyday agent tasks at a fraction of the resource cost of large cloud models
- Democratic Access: Powerful AI available on laptops, development machines, and small servers
- Sustainable AI: Lower energy consumption, reduced cloud dependency
- Zero Data Leakage: Code analysis, file processing, and conversations never leave your environment
- Regulatory Compliance: Perfect for regulated industries requiring data locality
- Trust: You control the AI, not the other way around
- No Recurring Costs: One-time setup vs. ongoing API charges
- Unlimited Usage: No rate limits or quotas restricting your productivity
- Predictable Costs: Hardware investment vs. unpredictable cloud bills
The migration maintains complete feature parity while providing the benefits of local execution, privacy, and cost savings. This is the future of AI development - powerful, private, and under your control.