OnomaTool is your AI-powered file renaming assistant! 🧠✨
- Rename files in bulk with smart, context-aware suggestions 🤖
- Supports PDFs, images, markdown, SVG, PPTX, DOCX, TXT, and more! 📄🖼️
- Always preserves file extensions 🔒
- CLI with dry-run, interactive, verbose, and debug modes 🖥️
- Configurable via `.onomarc` TOML config file ⚙️
- Uses Markitdown for unified file processing 📝
- NEW: Advanced UTF-8 encoding detection and conversion for text files 🔤
- NEW: Configurable word count limits for filenames 🔤
- 🦾 AI Suggestions: Get 3 smart file name ideas for every file
- 🤖 Multiple LLM Providers: OpenAI (including local endpoints) and Google Gemini support
- 🧩 Conflict Resolution: Never overwrite files - automatic numeric suffix handling
- 🔒 Extension Preservation: Original file extensions are always preserved
- 📁 Glob Pattern Support: Process files using flexible glob patterns
- 📄 PDF Files: Extract markdown content + generate images for each page
- 🖼️ SVG Files: Convert to PNG for AI analysis (enforced PNG-only processing)
- 📊 PPTX Files: Extract content + generate images for each slide using LibreOffice
- 📝 Text Files: UTF-8 encoding detection and conversion + markdown processing
- 🖼️ Image Files: Base64 encoding for direct AI image analysis
- 📑 Office Documents: DOCX, XLSX support via Markitdown
- 🔤 Unicode Support: Automatic encoding detection for text files with chardet
- 🧪 Dry-Run Mode: Preview changes without modifying files (`--dry-run`)
- 🤝 Interactive Mode: Confirm changes after dry-run preview (`--interactive`)
- 🔍 Debug Mode: Preserve temp files and show processing paths (`--debug`)
- 📢 Verbose Mode: Show LLM requests and responses (`--verbose`)
- ⚙️ Config Generation: Generate default config file (`--save-config`)
- 🎯 Smart Processing: Combined image + text analysis for documents
- 🏗️ Modular Architecture: Extensible processor system
- 🌐 Local LLM Support: Works with local OpenAI-compatible endpoints
- 📊 Multiple Naming Conventions: snake_case, CamelCase, kebab-case, and more
- 🛡️ SSL Flexibility: Automatic SSL handling for local/HTTP endpoints
- 🔤 Encoding Intelligence: Automatic detection and UTF-8 conversion for text files
- 🧪 Comprehensive Testing: 13+ test cases for encoding reliability
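The encoding bullets above describe detecting a text file's encoding and converting it to UTF-8. A minimal sketch of that idea follows, assuming `chardet` is installed (it falls back to a fixed encoding when it is not). The function name `decode_to_text` and the `latin-1` fallback are illustrative assumptions, not OnomaTool's actual code.

```python
try:
    import chardet  # third-party; listed in requirements.txt per this README
except ImportError:  # keep the sketch runnable without it
    chardet = None


def decode_to_text(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode bytes to str: try UTF-8 first, then a detected encoding.

    Hypothetical helper for illustration only.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    # chardet.detect returns a dict with "encoding" and "confidence" keys;
    # "encoding" can be None when detection fails.
    guess = chardet.detect(raw).get("encoding") if chardet else None
    try:
        return raw.decode(guess or fallback)
    except (LookupError, UnicodeDecodeError):
        # Last resort: never raise, replace undecodable bytes instead.
        return raw.decode(fallback, errors="replace")


text = decode_to_text("café and “smart quotes”".encode("utf-8"))
utf8_bytes = text.encode("utf-8")  # bytes that later steps could rely on
```

Decoding into a `str` and re-encoding a copy leaves the file on disk untouched, which matches the README's claim that originals are preserved.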
# Clone the repository
git clone https://github.com/yourusername/onomatool.git
cd onomatool
# Create virtual environment (recommended)
python3 -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# For encoding detection (included in requirements.txt)
pip install chardet
# Install the package
pip install -e .
# Install directly from the repository
pip install git+https://github.com/yourusername/onomatool.git
For full functionality (SVG, PDF, PPTX processing):
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install libreoffice imagemagick libcairo2 libpango-1.0-0 libpangocairo-1.0-0
# macOS (with Homebrew)
brew install libreoffice imagemagick cairo pango
# Windows: Download and install LibreOffice and ImageMagick
# For SVG support: pip install cairosvg (requires Cairo system libraries)
# Rename all PDFs in current directory
onomatool '*.pdf'
# Process files in subdirectories
onomatool 'docs/**/*.md'
# Specify file format explicitly
onomatool '*.unknown' --format pdf
# See what would be renamed (no changes made)
onomatool '*.jpg' --dry-run
# Interactive confirmation after preview
onomatool '*.pdf' --dry-run --interactive
# Debug mode - preserve temp files
onomatool '*.svg' --debug
# Verbose mode - see LLM interactions
onomatool '*.docx' --verbose
# Combined modes
onomatool '*.pptx' --debug --verbose --dry-run
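The commands above quote their patterns so the shell passes them through unexpanded. How OnomaTool expands them is not shown here; the sketch below assumes Python's standard `glob` module with `recursive=True`, where `**` matches zero or more directories. The helper name `collect_files` is hypothetical.

```python
import glob
import os
import tempfile


def collect_files(pattern: str) -> list[str]:
    """Return sorted regular files matching a glob pattern ('**' recurses)."""
    matches = glob.glob(pattern, recursive=True)
    return sorted(p for p in matches if os.path.isfile(p))


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "docs", "guide"))
    for rel in ("docs/a.md", "docs/guide/b.md", "docs/c.txt"):
        open(os.path.join(root, *rel.split("/")), "w").close()
    found = collect_files(os.path.join(root, "docs", "**", "*.md"))
    names = [os.path.relpath(p, root).replace(os.sep, "/") for p in found]
```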
OnomaTool uses a TOML configuration file at `~/.onomarc` or a custom path with `--config`.
onomatool --save-config
# API Configuration
default_provider = "openai" # or "google"
openai_api_key = "sk-..."
openai_base_url = "https://api.openai.com/v1" # or local endpoint
google_api_key = "your-google-api-key"
# Model and Behavior
llm_model = "gpt-4o" # or "gemini-pro"
naming_convention = "snake_case" # snake_case, CamelCase, kebab-case, etc.
# Custom Prompts (optional - defaults provided)
system_prompt = "You are a file naming assistant."
user_prompt = "Suggest 3 file names for: {content}"
image_prompt = "Suggest 3 file names for this image."
# Word count limits (NEW!)
min_filename_words = 5 # Minimum words required (ensures descriptive names)
max_filename_words = 15 # Maximum words allowed (prevents overly long names)
# Markitdown Configuration
[markitdown]
enable_plugins = false
docintel_endpoint = ""
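To see how a file like this parses, here is a small sketch using `tomllib` (standard library from Python 3.11; on 3.10 the third-party `tomli` backport offers the same API). The helper `resolve_config_path` is a hypothetical illustration of "`--config` wins, otherwise `~/.onomarc`", not OnomaTool's real loader. In TOML, keys written after a `[markitdown]` header belong to that table, so the sample keeps top-level keys above it.

```python
from pathlib import Path

try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for Python 3.10

# Trimmed sample using key names from the configuration shown above.
SAMPLE = """
default_provider = "openai"
openai_base_url = "http://localhost:1234/v1"
naming_convention = "snake_case"
min_filename_words = 5
max_filename_words = 15

[markitdown]
enable_plugins = false
"""


def resolve_config_path(cli_path=None) -> Path:
    """Hypothetical helper: an explicit path wins, otherwise ~/.onomarc."""
    return Path(cli_path) if cli_path else Path.home() / ".onomarc"


config = tomllib.loads(SAMPLE)
provider = config["default_provider"]
plugins_enabled = config["markitdown"]["enable_plugins"]
```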
- `snake_case` (default)
- `CamelCase`
- `kebab-case`
- `PascalCase`
- `dot.notation`
- `natural language`
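As a rough illustration of what these conventions produce, the sketch below formats a suggested title in several of them. It is an assumption-laden example: `apply_convention` is a hypothetical name, word splitting is ASCII-only, and `CamelCase` is left out because this README does not say how it differs from `PascalCase`.

```python
import re


def split_words(text: str) -> list[str]:
    """Lowercase ASCII alphanumeric words found in free text."""
    return re.findall(r"[a-z0-9]+", text.lower())


def apply_convention(text: str, convention: str = "snake_case") -> str:
    """Join the words of `text` according to a naming convention."""
    words = split_words(text)
    if convention == "snake_case":
        return "_".join(words)
    if convention == "kebab-case":
        return "-".join(words)
    if convention == "dot.notation":
        return ".".join(words)
    if convention == "PascalCase":
        return "".join(w.capitalize() for w in words)
    if convention == "natural language":
        return " ".join(words)
    raise ValueError(f"unsupported convention: {convention}")


title = "Quarterly Sales Report 2024"
snake = apply_convention(title)                # quarterly_sales_report_2024
pascal = apply_convention(title, "PascalCase")  # QuarterlySalesReport2024
```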
File Type | Processing Method | Output |
---|---|---|
PDF | Markitdown + PyMuPDF page images | Combined text + image analysis |
PPTX | Markitdown + LibreOffice slide images | Combined text + image analysis |
SVG | Convert to PNG + Markitdown | Image analysis only |
Images (JPG, PNG, etc.) | Base64 encoding | Direct image analysis |
DOCX | Markitdown processing | Text analysis |
TXT, MD, NOTE | UTF-8 encoding detection + text processing | Text analysis |
XLSX | Markitdown processing | Content analysis |
CSV, JSON, XML, HTML | UTF-8 encoding detection + Markitdown | Content analysis |
Code Files (PY, JS, CSS, YAML) | UTF-8 encoding detection + text processing | Code analysis |
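The "Base64 encoding" row can be pictured with the standard library alone. This sketch wraps image bytes in a `data:` URL, one common way to hand images to LLM APIs. Whether OnomaTool sends a data URL or the raw base64 string is not stated here, and `image_to_data_url` is a hypothetical name.

```python
import base64
import mimetypes


def image_to_data_url(data: bytes, filename: str) -> str:
    """Encode image bytes as a base64 data URL, guessing the MIME type."""
    mime, _ = mimetypes.guess_type(filename)
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{encoded}"


# The 8-byte PNG signature stands in for real image bytes.
png_signature = b"\x89PNG\r\n\x1a\n"
url = image_to_data_url(png_signature, "photo.png")
```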
src/onomatool/
├── cli.py # Command-line interface
├── config.py # Configuration management
├── llm_integration.py # OpenAI/Google API integration
├── file_dispatcher.py # File routing logic
├── processors/ # File processing modules
│ ├── markitdown_processor.py
│ └── text_processor.py
├── utils/ # Utility functions
│ └── image_utils.py # SVG conversion utilities
├── prompts.py # Default prompts
├── renamer.py # File renaming logic
├── conflict_resolver.py # Filename conflict handling
└── file_collector.py # Glob pattern matching
# Install test dependencies
pip install pytest pytest-mock
# Run all tests
pytest
# Run specific test suites
pytest tests/test_usage_enduser.py # End-to-end user tests
pytest tests/test_utf8_encoding.py # UTF-8 encoding tests
# Run with coverage
pytest --cov=onomatool
# Format code
ruff format .
# Lint code
ruff check --fix .
# Run all checks
ruff check . && ruff format --check .
# Use custom config file
onomatool '*.pdf' --config /path/to/custom.toml
# In your .onomarc
default_provider = "openai"
openai_base_url = "http://localhost:1234/v1"
openai_api_key = "not-needed-for-local"
# Force format detection
onomatool 'unknown_files/*' --format pdf
onomatool 'images/*' --format image
- No Overwrites: Built-in conflict resolution with numeric suffixes
- Extension Preservation: Original file extensions always maintained
- Dry-Run Mode: Preview all changes before execution
- Temp File Management: Automatic cleanup (preservable in debug mode)
- Error Handling: Graceful failure with clear error messages
- Encoding Safety: Automatic UTF-8 conversion preserves original files
- Unicode Compatibility: Handles em dashes, smart quotes, accented characters
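Three of the guarantees above (no overwrites, extension preservation, dry-run) fit in a short sketch. It assumes a `_1`, `_2`, ... suffix format; the README only says "numeric suffix", so the exact format and the names `safe_target` and `rename` are illustrative.

```python
import tempfile
from pathlib import Path


def safe_target(source: Path, new_stem: str) -> Path:
    """Keep the original extension; add _1, _2, ... until the name is free."""
    candidate = source.with_name(new_stem + source.suffix)
    counter = 1
    while candidate.exists() and candidate != source:
        candidate = source.with_name(f"{new_stem}_{counter}{source.suffix}")
        counter += 1
    return candidate


def rename(source: Path, new_stem: str, dry_run: bool = True) -> Path:
    """Return the chosen target; only touch the disk when dry_run is False."""
    target = safe_target(source, new_stem)
    if not dry_run:
        source.rename(target)
    return target


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "scan0001.pdf").write_text("a")
    (folder / "tax_return_2023.pdf").write_text("b")  # name already taken
    preview_name = rename(folder / "scan0001.pdf", "tax_return_2023").name
    untouched = (folder / "scan0001.pdf").exists()
    final_name = rename(folder / "scan0001.pdf", "tax_return_2023", dry_run=False).name
    kept_existing = (folder / "tax_return_2023.pdf").read_text() == "b"
    moved = not (folder / "scan0001.pdf").exists()
```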
We welcome contributions! Please:
- Fork the repository 🍴
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Write tests for your changes 🧪
- Follow PEP8 and run `ruff check --fix .` 🐍
- Update `CHANGELOG.md` and `FILETREE.md` 📚
- Submit a pull request 🚀
# Clone and setup development environment
git clone https://github.com/yourusername/onomatool.git
cd onomatool
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -e .
# Install development dependencies
pip install pytest pytest-mock ruff
# Run tests to verify setup
pytest tests/test_utf8_encoding.py
pytest tests/test_usage_enduser.py
MIT License - see LICENSE file for details.
Q: Does it work on Windows/Mac/Linux?
A: Yes! Cross-platform support with Python 3.10+.

Q: Can I use local LLMs?
A: Yes! Set `openai_base_url` to your local endpoint in `.onomarc`.

Q: Will it overwrite my files?
A: Never! Built-in conflict resolution prevents overwrites.

Q: What if my API key is invalid?
A: The tool will show clear error messages and fail gracefully.

Q: Can I customize the AI prompts?
A: Yes! Set `system_prompt`, `user_prompt`, and `image_prompt` in your config.

Q: How does SVG processing work?
A: SVGs are converted to PNG images before AI analysis for better results.

Q: Can I see what the AI is thinking?
A: Use `--verbose` to see full LLM requests and responses.

Q: What about files with special characters or different encodings?
A: OnomaTool automatically detects and converts file encodings to UTF-8, handling em dashes, accented characters, and other Unicode symbols seamlessly.

Q: Does it work with files that have encoding issues?
A: Yes! The tool uses advanced encoding detection to identify and convert problematic files while preserving the original content.

Q: How do word count limits affect filename generation?
A: Word count limits control the minimum and maximum number of words in generated filenames. This helps maintain descriptive and concise naming conventions.
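To make the word count limits concrete, here is a minimal check against the default values from the configuration (`min_filename_words = 5`, `max_filename_words = 15`). It assumes words are separated by `_`, `-`, `.`, or whitespace, so run-together styles such as `PascalCase` would count as one word. What OnomaTool does with a suggestion outside the limits is not documented here, and both function names are hypothetical.

```python
import re


def count_words(stem: str) -> int:
    """Count words in a filename stem split on _, -, ., or whitespace."""
    return len([w for w in re.split(r"[_\-.\s]+", stem) if w])


def within_limits(stem: str, min_words: int = 5, max_words: int = 15) -> bool:
    """True when the stem's word count falls inside the inclusive range."""
    return min_words <= count_words(stem) <= max_words


too_short = within_limits("q3_sales_report")
acceptable = within_limits("quarterly_sales_report_for_north_region")
```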
Made with ❤️, AI, and a lot of `import os`.