MP3Detective is a powerful tool that automatically updates audio file metadata using AI language models. It works with both OpenAI's GPT models and local models served via Ollama, handles multiple audio formats, and can identify song details and update metadata tags for any music genre and language.
- Dual LLM Support: Choose between OpenAI GPT models or local Ollama models
- Multi-format Support: MP3, FLAC, M4A, MP4, OGG, OPUS
- Universal Music Knowledge: Supports all genres and languages
- Batch Processing: Process multiple files with progress tracking
- Privacy Options: Use local models to keep data private
- Cost Control: Avoid API costs with local LLM inference
- Smart Metadata: Automatically identifies title, artist, album, year, composer, genre, language
- Safe Processing: Preserves original files by creating copies
- Detailed Logging: Comprehensive logs for troubleshooting
- Python 3.8 or higher
- Git (for cloning the repository)
- 8GB RAM minimum (16GB+ recommended for larger models)
Choose ONE of the following:
- OpenAI API key (available from the OpenAI dashboard)
- Internet connection required
- Pay-per-use pricing model
System Requirements for Ollama:
| Component | Minimum | Recommended | Notes |
|---|---|---|---|
| RAM | 8GB | 16GB+ | More RAM = larger models |
| Storage | 10GB | 25GB+ | Models range from 2GB-40GB+ |
| CPU | Any modern CPU | Intel 11th Gen+ / AMD Zen4+ / Apple M-series | AVX512 support helps |
| GPU | None (CPU only) | NVIDIA GTX 1060+ (8GB+ VRAM) | Dramatically improves speed |
| OS | Windows 10+, macOS 11+, Linux | Latest versions | Better compatibility |
Model Size vs RAM Requirements:
- Small models (4B params): 8GB+ RAM → gemma3:4b
- Medium models (7-8B params): 12-16GB+ RAM → mistral:7b, llama3.1:8b
- Large models (12B params): 16GB+ RAM → gemma3:12b
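To check whether a machine meets these RAM figures from Python, a quick sketch (POSIX-only, since the `os.sysconf` keys used here are not available on Windows; Windows users can check Task Manager instead):

```python
import os

def total_ram_gb() -> float:
    """Approximate installed RAM in GB (POSIX systems only)."""
    pages = os.sysconf("SC_PHYS_PAGES")      # number of physical memory pages
    page_size = os.sysconf("SC_PAGE_SIZE")   # bytes per page
    return pages * page_size / (1024 ** 3)
```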
```bash
# Clone the repository
git clone https://github.com/deepakness/mp3detective.git
cd mp3detective

# Create and activate virtual environment
python -m venv venv

# On Windows
venv\Scripts\activate

# On macOS/Linux
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

Edit `app.py` and configure:

```python
LLM_PROVIDER = "openai"
OPENAI_API_KEY = "your-actual-api-key-here"
OPENAI_MODEL = "gpt-4o"  # or gpt-4o-mini, gpt-3.5-turbo
```

Windows:
```bash
# Download from https://ollama.ai
# Or use package managers:
winget install Ollama.Ollama
# OR
choco install ollama
```

macOS:
```bash
# Download from https://ollama.ai
# Or use Homebrew:
brew install ollama
```

Linux:
```bash
curl -fsSL https://ollama.ai/install.sh | sh
```

Start with a recommended model (choose one):

```bash
# Best performance - most capable model
ollama pull gemma3:12b   # 8GB model, requires ~16GB RAM

# Good balance - recommended for most users
ollama pull gemma3:4b    # 3.5GB model, requires ~8GB RAM

# Excellent general purpose - Meta's latest
ollama pull llama3.1:8b  # 5GB model, requires ~12GB RAM

# Fast and efficient - great for beginners
ollama pull mistral:7b   # 5GB model, requires ~12GB RAM

# Check downloaded models
ollama list
```

Edit `app.py` and configure:
```python
LLM_PROVIDER = "ollama"
OLLAMA_BASE_URL = "http://localhost:11434"  # Default Ollama URL
OLLAMA_MODEL = "gemma3:4b"                  # Use the model you downloaded
```

Place your audio files in the `input/` directory (supported formats: MP3, FLAC, M4A, MP4, OGG, OPUS), then run the application:

```bash
python app.py
```

Edit the configuration section in `app.py`:
```python
# Choose your LLM provider
LLM_PROVIDER = "openai"  # Options: "openai", "ollama"

# OpenAI Configuration (only if using OpenAI)
OPENAI_API_KEY = "your-api-key-here"
OPENAI_MODEL = "gpt-4o"  # gpt-4o, gpt-4o-mini, gpt-3.5-turbo

# Ollama Configuration (only if using Ollama)
OLLAMA_BASE_URL = "http://localhost:11434"
OLLAMA_MODEL = "gemma3:4b"  # gemma3:12b, gemma3:4b, llama3.1:8b, mistral:7b

INPUT_FOLDER = "input"    # Source audio files directory
OUTPUT_FOLDER = "output"  # Processed files directory
BATCH_SIZE = 10           # Progress update frequency
RATE_LIMIT_DELAY = 1.0    # Delay between API calls (seconds)
OVERWRITE = True          # Overwrite existing metadata
```

| Model | Best For | Speed | Quality | Cost |
|---|---|---|---|---|
| gpt-4o | Best overall quality | Medium | ⭐⭐⭐⭐⭐ | $$$ |
| gpt-4o-mini | Balanced performance | Fast | ⭐⭐⭐⭐ | $$ |
| gpt-3.5-turbo | Speed and economy | Very Fast | ⭐⭐⭐ | $ |
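The `RATE_LIMIT_DELAY` setting spaces out successive API calls. A sketch of what such pacing might look like (illustrative only, not the actual `app.py` loop; `identify` stands in for whatever function calls the LLM):

```python
import time

RATE_LIMIT_DELAY = 1.0  # seconds between calls, matching the config above

def process_all(files, identify, delay=RATE_LIMIT_DELAY):
    """Call identify() on each file, pausing between calls to respect rate limits."""
    results = []
    for i, f in enumerate(files):
        results.append(identify(f))
        if i < len(files) - 1:  # no need to sleep after the last file
            time.sleep(delay)
    return results
```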
| Model | Size | RAM Needed | Best For | Performance |
|---|---|---|---|---|
| gemma3:12b | ~8GB | 16GB+ | Best quality & accuracy | ⭐⭐⭐⭐⭐ |
| gemma3:4b | ~3.5GB | 8GB+ | Balanced performance | ⭐⭐⭐⭐ |
| llama3.1:8b | ~5GB | 12GB+ | General purpose excellence | ⭐⭐⭐⭐⭐ |
| mistral:7b | ~5GB | 12GB+ | Fast & efficient | ⭐⭐⭐⭐ |
```
mp3detective/
├── .gitignore        # Git ignore rules
├── app.py            # Main application file
├── requirements.txt  # Python dependencies
├── README.md         # This documentation
├── input/            # Place your audio files here
└── output/           # Updated files will appear here
```
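The `input/` folder is presumably scanned by file extension; a minimal sketch (the exact filtering logic in `app.py` may differ):

```python
from pathlib import Path
from typing import List

# Extensions matching the supported formats listed above
SUPPORTED = {".mp3", ".flac", ".m4a", ".mp4", ".ogg", ".opus"}

def find_audio_files(folder: str = "input") -> List[Path]:
    """List supported audio files in the given folder, sorted for stable order."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )
```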
1. Prepare Audio Files: Place your audio files (MP3, FLAC, M4A, MP4, OGG, OPUS) in the `input/` folder
2. Configure LLM Provider: Choose between OpenAI or Ollama in `app.py`
3. Run the Application: `python app.py`
4. Monitor Progress: Watch the console for processing updates
5. Review Results:
   - Check the `output/` folder for processed files
   - Review `metadata_updater.log` for detailed information
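Under the hood, each file goes through a prompt → LLM response → tag-update cycle. A sketch of the first two stages (the real prompt wording and response schema in `app.py` may differ; `build_prompt` and `parse_response` are illustrative names, not the tool's actual functions):

```python
import json
from pathlib import Path

# The metadata fields the tool identifies, per the feature list
FIELDS = ["title", "artist", "album", "year", "composer", "genre", "language"]

def build_prompt(filepath: str) -> str:
    """Ask the LLM to identify a song from its filename."""
    name = Path(filepath).stem
    return (
        f"Identify the song in the file named '{name}'. "
        f"Reply with a JSON object containing: {', '.join(FIELDS)}."
    )

def parse_response(reply: str) -> dict:
    """Keep only the expected metadata fields from the model's JSON reply."""
    data = json.loads(reply)
    return {k: data[k] for k in FIELDS if k in data}

# Example with a hypothetical model reply:
tags = parse_response('{"title": "Imagine", "artist": "John Lennon", "mood": "calm"}')
# tags -> {'title': 'Imagine', 'artist': 'John Lennon'}
```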
```bash
# If you get module errors:
pip install -r requirements.txt

# If you're in the wrong directory:
cd mp3detective

# If Python environment issues:
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
```

- Invalid API Key: Verify your key at the OpenAI Dashboard
- Insufficient Credits: Check your OpenAI billing
- Network Issues: Ensure stable internet connection
- Rate Limits: Increase `RATE_LIMIT_DELAY` in the configuration
Installation Problems:

```bash
# Check if Ollama is running
ollama serve

# Verify installation
ollama --version

# List available models
ollama list
```

Model Download Issues:

```bash
# Download a specific model
ollama pull gemma3:4b

# Check download status
ollama list

# Remove and re-download if corrupted
ollama rm gemma3:4b
ollama pull gemma3:4b
```

Connection Issues:
- Server not running: Start Ollama with `ollama serve`
- Wrong URL: Verify `OLLAMA_BASE_URL` matches your Ollama server
- Model not found: Ensure the model name matches exactly (case-sensitive)
- Memory issues: Close other applications to free up RAM
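To verify connectivity and model availability from Python, you can query the same `/api/tags` endpoint that `ollama list` uses (a sketch; the network call naturally requires a running server):

```python
import json
from typing import List
from urllib.request import urlopen

def parse_model_names(tags_json: str) -> List[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    data = json.loads(tags_json)
    return [m["name"] for m in data.get("models", [])]

def ollama_models(base_url: str = "http://localhost:11434") -> List[str]:
    """Query a running Ollama server for its locally available models."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return parse_model_names(resp.read().decode())
```

If the model you configured in `OLLAMA_MODEL` does not appear in the returned list, pull it again with `ollama pull`.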
- Unsupported format: Convert to MP3, FLAC, M4A, MP4, OGG, or OPUS
- Corrupted files: Try with different audio files
- Permission errors: Ensure write permissions for the `output/` directory
- OpenAI: Data sent to OpenAI servers (subject to their privacy policy)
- Ollama: All processing stays on your local machine (completely private)
- Never commit API keys to version control
- Keep your OpenAI API key secure and rotate regularly
- Use environment variables for sensitive configuration
- Backup your original audio files before processing
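The environment-variable advice above can be sketched as follows (`OPENAI_API_KEY` as a variable name is a common convention used here for illustration; `app.py` itself reads the key from its config constant):

```python
import os

def load_api_key(var: str = "OPENAI_API_KEY") -> str:
    """Fetch a secret from the environment, failing loudly if it is missing."""
    key = os.environ.get(var, "")
    if not key:
        raise RuntimeError(f"Set the {var} environment variable first")
    return key

# In app.py's configuration section you could then write:
# OPENAI_API_KEY = load_api_key()
```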
- Keep original files backed up separately
- Use descriptive filenames for better AI recognition
- Process files in small batches initially to test configuration
- For accuracy: Use OpenAI GPT-4o or gemma3:12b/llama3.1:8b
- For speed: Use OpenAI GPT-3.5-turbo or gemma3:4b/mistral:7b
- For privacy: Always use Ollama models
- For cost: Use Ollama models or OpenAI GPT-3.5-turbo
- Close unnecessary applications when using Ollama
- Use GPU acceleration if available
- Adjust `RATE_LIMIT_DELAY` based on your setup
- Process files during off-peak hours for better performance
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License.
- OpenAI for providing powerful GPT models via API
- Ollama for making local LLM inference accessible
- Mutagen for comprehensive audio metadata manipulation
- tqdm for beautiful progress bars
- requests for reliable HTTP communication
If you encounter any issues or have questions:
- Check the Issues page
- Create a new issue if your problem isn't already reported
- Include relevant logs from `metadata_updater.log`
- Specify your OS, Python version, and LLM provider
Based on testing with diverse music collections:
- Overall accuracy: ~95% for mainstream music
- Genre variations: Pop/Rock (98%) > Classical (92%) > Experimental (85%)
- Language support: Excellent for major languages, good for regional languages
- Era coverage: Recent music (98%) > Vintage music (90%) > Obscure tracks (80%)
Note: Accuracy depends on the LLM model used and the clarity of filenames. Always verify metadata for critical collections.
🎵 Happy metadata organizing! 🎵