Migration Summary: Anthropic to Ollama

Migration Goal: Transition from cloud-based AI to local, privacy-focused AI using efficient small models

Migration Scope: Complete migration of AI agent codebase from cloud-based Anthropic API to local Ollama implementation

Target Model: qwen3:4b - A compact, efficient model (4B parameters) designed for local execution

Key Benefits:

  • 🏠 Local Execution: No data leaves your machine - complete privacy
  • ⚑ Small Model Efficiency: Fast inference with minimal resource requirements
  • πŸ’° Cost-Free Operation: No API fees or usage limits
  • πŸ”Œ Offline Capability: Works without internet connection
  • πŸŽ›οΈ Full Control: Host locally or on your private servers

Migration Date: October 5, 2025

📖 Server Implementation: For detailed documentation on the --server argument feature (enabling remote Ollama servers), see SERVER_IMPLEMENTATION.md

This document provides a comprehensive overview of the migration from Anthropic's Claude API to Ollama's local API using the qwen3:4b model.

📋 Overview

Migration Scope: Complete migration of AI agent codebase from cloud-based Anthropic API to local Ollama implementation

Target Model: qwen3:4b (consistently used across all files)

Migration Date: October 5, 2025

Reference Documentation

This migration was based on the official documentation for both APIs:

Source API (Anthropic Claude)

Target API (Ollama)

  • Tool Support Documentation: https://ollama.com/blog/tool-support
  • API Format: OpenAI-compatible tool-calling convention, with each tool's JSON Schema passed under the parameters key
  • Response Structure: Simplified message structure with tool_calls array

The migration involved converting between these two different tool calling conventions while maintaining identical functionality.

📁 Files Modified

Core Application Files

| File | Type of Changes | Status |
|------|-----------------|--------|
| main.py | Full migration - Dependencies, API calls, response handling, CLI args | ✅ Complete |
| README.MD | Documentation updates - Installation instructions, usage examples | ✅ Complete |
| test_ollama_migration.py | Created - Migration verification script | ✅ New File |

Runbook Tutorial Files

| File | Type of Changes | Status |
|------|-----------------|--------|
| runbook/01_basic_script.py | Minimal - Comment updates only | ✅ Complete |
| runbook/02_agent_class.py | Basic migration - Dependencies, initialization | ✅ Complete |
| runbook/03_define_tools.py | Basic migration - Dependencies, initialization, tool setup | ✅ Complete |
| runbook/04_implement_tool_execution.py | Medium migration - Dependencies, initialization, tool execution | ✅ Complete |
| runbook/05_add_chat_method.py | Full migration - Dependencies, chat method, API calls, response handling | ✅ Complete |
| runbook/06_create_interactive_cli.py | Full migration - Dependencies, chat method, CLI args, logging | ✅ Complete |
| runbook/07_add_personality.py | Full migration - Dependencies, chat method, CLI args, system prompts | ✅ Complete |

Verification & Documentation Files

| File | Type of Changes | Status |
|------|-----------------|--------|
| tests/test_ollama_migration.py | Created - Basic migration functionality test | ✅ New File |
| tests/verify_runbook_migration.py | Created - Automated verification script | ✅ New File |
| docs/MIGRATION_SUMMARY.md | Created - This comprehensive migration documentation | ✅ New File |

🔄 Types of Changes Applied

1. Dependency Changes

Files Affected: All Python files with dependencies

Before:

# /// script
# dependencies = [
#     "anthropic",
#     "pydantic",
# ]
# ///

After:

# /// script
# dependencies = [
#     "ollama",
#     "pydantic",
# ]
# ///

2. Import Statements

Files Affected: main.py, runbook/02-07_*.py

Before:

from anthropic import Anthropic

After:

import ollama

3. Class Initialization

Files Affected: All files with AIAgent class

Before:

def __init__(self, api_key: str):
    self.client = Anthropic(api_key=api_key)

After:

def __init__(self, model: str = "qwen3:4b"):
    self.model = model

4. Tool Schema Format Conversion

Files Affected: Files with chat functionality (main.py, runbook/05-07_*.py)

This was the most significant change, converting between two different tool calling conventions:

Before (Anthropic format):

tool_schemas = [
    {
        "name": tool.name,
        "description": tool.description,
        "input_schema": tool.input_schema,    # Anthropic uses 'input_schema'
    }
    for tool in self.tools
]

After (Ollama/OpenAI format):

ollama_tools = [
    {
        "type": "function",                   # OpenAI standard requires 'type': 'function'
        "function": {
            "name": tool.name,
            "description": tool.description,
            "parameters": tool.input_schema,  # OpenAI uses 'parameters'
        },
    }
    for tool in self.tools
]
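The conversion is mechanical, so it can be written as a small pure function. The sketch below assumes each tool is already described by an Anthropic-style dict with name, description, and input_schema keys. The function name to_ollama_tool and the sample read_file schema are illustrative and not taken from the repository.

```python
from typing import Any


def to_ollama_tool(anthropic_tool: dict[str, Any]) -> dict[str, Any]:
    """Wrap an Anthropic-style tool schema in the OpenAI-style envelope."""
    return {
        "type": "function",
        "function": {
            "name": anthropic_tool["name"],
            "description": anthropic_tool.get("description", ""),
            # The JSON Schema itself is unchanged; only the key is renamed.
            "parameters": anthropic_tool["input_schema"],
        },
    }


# Hypothetical tool definition, used only for illustration.
read_file_tool = {
    "name": "read_file",
    "description": "Read the contents of a file",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

converted = to_ollama_tool(read_file_tool)
print(converted["function"]["name"])
```

Because the schema body passes through untouched, existing tool definitions need no edits beyond this wrapper.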

5. API Call Transformation

Files Affected: Files with chat functionality

Before (Anthropic API):

response = self.client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=4096,
    system="You are a helpful assistant...",
    messages=self.messages,
    tools=tool_schemas,
)

After (Ollama API):

messages_with_system = [
    {
        "role": "system",
        "content": "You are a helpful assistant..."
    }
] + self.messages

response = ollama.chat(
    model=self.model,
    messages=messages_with_system,
    tools=ollama_tools,
)
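The key difference here is where the system prompt lives: Anthropic takes it as a separate argument, while the Ollama call above expects it as the first message. A minimal sketch of that step follows; the helper name with_system_prompt is hypothetical. It builds a new list on every call so the system message is never stored in the conversation history itself.

```python
def with_system_prompt(system: str, history: list[dict]) -> list[dict]:
    """Return a new message list with the system prompt first.

    The history list is not modified, so calling this on every turn
    does not accumulate duplicate system messages.
    """
    return [{"role": "system", "content": system}, *history]


history = [{"role": "user", "content": "List the files in this directory."}]
payload = with_system_prompt("You are a helpful assistant.", history)

print(len(payload), len(history))
```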

6. Response Handling Restructure

Files Affected: Files with chat functionality

Before (Complex content blocks):

assistant_message = {"role": "assistant", "content": []}

for content in response.content:
    if content.type == "text":
        assistant_message["content"].append({
            "type": "text", 
            "text": content.text
        })
    elif content.type == "tool_use":
        assistant_message["content"].append({
            "type": "tool_use",
            "id": content.id,
            "name": content.name,
            "input": content.input,
        })

After (Simple message structure):

message = response.get("message", {})

self.messages.append({
    "role": "assistant",
    "content": message.get("content", ""),
    "tool_calls": message.get("tool_calls", [])
})
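Once the assistant message is stored, the agent still has to decide which tools to run. The sketch below shows one way to pull the requested calls out of the response. It assumes the response is a plain dict in which each entry of tool_calls carries a function object with a name and an arguments dict. The helper name extract_tool_calls and the sample response are illustrative.

```python
from typing import Any


def extract_tool_calls(response: dict[str, Any]) -> list[tuple[str, dict[str, Any]]]:
    """Return (tool_name, arguments) pairs from an Ollama-style chat response.

    Missing or None values are treated as "no tool calls" rather than errors.
    """
    message = response.get("message") or {}
    calls = []
    for call in message.get("tool_calls") or []:
        function = call.get("function") or {}
        calls.append((function.get("name", ""), function.get("arguments") or {}))
    return calls


# Hand-written sample response, assumed shape only.
sample_response = {
    "message": {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {"function": {"name": "read_file", "arguments": {"path": "main.py"}}}
        ],
    }
}

for name, arguments in extract_tool_calls(sample_response):
    print(name, arguments)
```

An empty result means the model answered in plain text and the tool loop can stop for this turn.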

7. Tool Result Format Changes

Files Affected: Files with tool execution

Before (Anthropic format):

tool_results.append({
    "type": "tool_result",
    "tool_use_id": content.id,
    "content": result,
})

After (Ollama format):

tool_results.append({
    "role": "tool",
    "content": result,
    "tool_call_id": tool_call.get("id", "")
})
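Putting the result format to work, the following sketch executes each requested tool and produces one role "tool" message per call, in the shape shown above. The registry dict, the run_tool_calls name, and the fake_list_files stand-in are assumptions made for illustration; the repository's own dispatch code may differ.

```python
from typing import Any, Callable


def run_tool_calls(
    tool_calls: list[dict[str, Any]],
    registry: dict[str, Callable[..., Any]],
) -> list[dict[str, str]]:
    """Execute each requested tool and return one 'tool' message per call."""
    results = []
    for call in tool_calls:
        function = call.get("function") or {}
        name = function.get("name", "")
        arguments = function.get("arguments") or {}
        handler = registry.get(name)
        if handler is None:
            content = f"Error: unknown tool '{name}'"
        else:
            try:
                content = str(handler(**arguments))
            except Exception as exc:
                # Report the failure to the model instead of crashing the loop.
                content = f"Error: {exc}"
        results.append({
            "role": "tool",
            "content": content,
            "tool_call_id": call.get("id", ""),
        })
    return results


# Hypothetical stand-in for the real list_files tool.
def fake_list_files(path: str = ".") -> str:
    return f"files in {path}: a.py, b.py"


tool_messages = run_tool_calls(
    [
        {"function": {"name": "list_files", "arguments": {"path": "src"}}},
        {"function": {"name": "not_a_tool", "arguments": {}}},
    ],
    {"list_files": fake_list_files},
)
print(tool_messages[0]["content"])
```

Returning errors as tool content keeps the conversation going, so the model can retry with corrected arguments.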

8. Command Line Interface Updates

Files Affected: main.py, runbook/06-07_*.py

Before:

parser.add_argument(
    "--api-key", 
    help="Anthropic API key (or set ANTHROPIC_API_KEY env var)"
)

api_key = args.api_key or os.environ.get("ANTHROPIC_API_KEY")
if not api_key:
    print("Error: Please provide an API key")
    sys.exit(1)

agent = AIAgent(api_key)

After:

parser.add_argument(
    "--model", 
    default="qwen3:4b",
    help="Ollama model to use (default: qwen3:4b)"
)

agent = AIAgent(args.model)
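For reference, here is a self-contained sketch of the new argument parsing, including the --server flag that appears in the usage examples of this document. The help text for --server and its default of None (meaning the local Ollama instance) are assumptions; SERVER_IMPLEMENTATION.md describes the actual behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the CLI parser; no API key is needed after the migration."""
    parser = argparse.ArgumentParser(description="AI agent backed by Ollama")
    parser.add_argument(
        "--model",
        default="qwen3:4b",
        help="Ollama model to use (default: qwen3:4b)",
    )
    parser.add_argument(
        "--server",
        default=None,  # assumed default: fall back to the local instance
        help="URL of a remote Ollama server",
    )
    return parser


# Parse an explicit argument list so the example runs without a real CLI.
args = build_parser().parse_args(["--server", "http://remote-host:11434"])
print(args.model, args.server)
```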

9. Documentation Updates

Files Affected: README.MD, all runbook files (comments)

Before:

# export ANTHROPIC_API_KEY="your-api-key-here"
# uv run main.py

After:

# ollama serve  # Make sure Ollama is running
# ollama pull qwen3:4b  # Pull the model if not already available
# uv run main.py

🎯 Migration Benefits: Why Small Local Models?

🏠 Local-First AI Philosophy

This migration embraces the local-first AI movement - running compact, efficient models on your own hardware instead of relying on cloud services.

| Aspect | Before (Anthropic Cloud) | After (Ollama Local) |
|--------|--------------------------|----------------------|
| Privacy | Data sent to external service | 100% local - data never leaves your machine |
| Cost | Pay-per-use API charges | Completely free after setup |
| Internet | Required for every API call | Works completely offline |
| Performance | Network latency + processing | Local processing speed only |
| Rate Limits | API throttling and quotas | No limits - use as much as you want |
| Model Control | Limited to Anthropic models | Any Ollama model (qwen3:4b, llama, etc.) |
| Resource Usage | External cloud compute | Efficient small models on modest hardware |
| Availability | Dependent on API uptime | Always available - you control it |

⚡ Why qwen3:4b is Perfect for Local AI

  • Small Size: Only 4 billion parameters - runs on laptops and modest servers
  • High Efficiency: Excellent performance-to-size ratio
  • Fast Inference: Quick responses without heavy hardware requirements
  • Tool Support: Full function calling capabilities for agent workflows
  • Local Privacy: Perfect for sensitive code and data processing

🔧 Prerequisites After Migration

  1. Install Ollama:

    # Linux/macOS
    curl -fsSL https://ollama.com/install.sh | sh

    # Windows: Download from ollama.com

  2. Start Ollama Service:

    ollama serve

  3. Pull Required Model:

    ollama pull qwen3:4b
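These prerequisites can also be checked from Python before starting the agent. The sketch below is one possible preflight check, not part of the repository. It assumes Ollama listens on its default address http://localhost:11434 and that the /api/tags endpoint returns a JSON object with a models list whose entries have a name field. The function names are hypothetical.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def fetch_tags(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> Optional[dict]:
    """Return the parsed /api/tags response, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as reply:
            return json.load(reply)
    except (urllib.error.URLError, OSError, ValueError):
        return None


def preflight(model: str, tags_payload: Optional[dict]) -> str:
    """Translate the server state into 'ok' or a hint naming the missing step."""
    if tags_payload is None:
        return "Ollama is not reachable; run 'ollama serve' first."
    installed = {entry.get("name", "") for entry in tags_payload.get("models", [])}
    if model not in installed:
        return f"Model '{model}' is not installed; run 'ollama pull {model}'."
    return "ok"


# Hand-written payload so the example runs without a live server.
sample_payload = {"models": [{"name": "qwen3:4b"}]}
print(preflight("qwen3:4b", sample_payload))
print(preflight("qwen3:4b", None))

# Against a live server: preflight("qwen3:4b", fetch_tags())
```

Keeping the network call separate from the decision logic makes the check easy to test without a running server.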

🧪 Testing & Verification

Automated Verification

# Run verification script
python tests/verify_runbook_migration.py

# Test basic functionality
uv run tests/test_ollama_migration.py

Manual Testing

# Test main application
uv run main.py

# Test individual runbook files
uv run runbook/05_add_chat_method.py
uv run runbook/07_add_personality.py --model qwen3:4b

✅ Migration Status

  • Total Files Modified: 11 files
  • New Files Created: 3 files
  • Migration Status: ✅ Complete
  • Server Argument: ✅ Implemented with ollama.Client (details in SERVER_IMPLEMENTATION.md)
  • Verification Status: ✅ All tests passing
  • Functionality Status: ✅ Full feature parity maintained

📝 Compatibility Notes

  • Tool Execution Logic: Remains identical (read_file, list_files, edit_file)
  • File Operations: Work exactly the same
  • Conversation Flow: Context management preserved
  • Error Handling: Patterns maintained
  • Logging: Enhanced with tool execution logging
  • Interactive Experience: Fully preserved

🚀 Usage After Migration

# Start Ollama (must be running whenever the agent is used)
ollama serve

# Pull model (one-time setup)
ollama pull qwen3:4b

# Use main application
uv run main.py                                    # Uses qwen3:4b by default
uv run main.py --model qwen3:4b                  # Explicit model selection
uv run main.py --server http://remote-host:11434 # Connect to remote Ollama server

# Use runbook examples
uv run runbook/07_add_personality.py                          # Full interactive experience
uv run runbook/06_create_interactive_cli.py --server http://remote:11434  # With remote server
uv run runbook/05_add_chat_method.py                          # Test chat functionality

🌟 Why This Migration Matters: The Future is Local AI

This migration represents more than just a technical change - it's a shift toward democratized, privacy-first AI:

🏠 Local-First AI Movement

  • Data Sovereignty: Your code, your conversations, your intellectual property stays on your infrastructure
  • Independence: No vendor lock-in, no API dependencies, no service outages affecting your workflow
  • Accessibility: AI capabilities available to anyone with modest hardware, not just those who can afford API costs

⚡ Small Models, Big Impact

  • Efficiency Revolution: Modern small models like qwen3:4b can handle many agent tasks at a fraction of the resource cost of large hosted models
  • Democratic Access: Powerful AI available on laptops, development machines, and small servers
  • Sustainable AI: Lower energy consumption, reduced cloud dependency

🔒 Privacy by Design

  • Zero Data Leakage: Code analysis, file processing, and conversations never leave your environment
  • Regulatory Compliance: Perfect for regulated industries requiring data locality
  • Trust: You control the AI, not the other way around

💰 Economic Freedom

  • No Recurring Costs: One-time setup vs. ongoing API charges
  • Unlimited Usage: No rate limits or quotas restricting your productivity
  • Predictable Costs: Hardware investment vs. unpredictable cloud bills

The migration maintains complete feature parity while providing the benefits of local execution, privacy, and cost savings. This is the future of AI development - powerful, private, and under your control.