TTSFM is a free, OpenAI-compatible text-to-speech API service that provides a complete solution for converting text to natural-sounding speech based on OpenAI's GPT-4o mini TTS. Built on top of the openai.fm backend, it offers a powerful Python SDK, RESTful API endpoints, and an intuitive web playground for easy testing and integration.
What TTSFM Can Do:
- Multiple Voices: Choose from 6 high-quality voices (alloy, echo, fable, onyx, nova, shimmer)
- Flexible Audio Formats: Support for 6 audio formats (MP3, WAV, OPUS, AAC, FLAC, PCM)
- Speed Control: Adjust playback speed from 0.25x to 4.0x for different use cases
- Long Text Support: Automatic text splitting and audio combining for content of any length
- Real-time Streaming: WebSocket support for streaming audio generation
- Python SDK: Easy-to-use synchronous and asynchronous clients
- Web Playground: Interactive web interface for testing and experimentation
- Docker Ready: Pre-built Docker images for instant deployment
- Smart Detection: Automatic capability detection and helpful error messages
- OpenAI Compatible: Drop-in replacement for OpenAI's TTS API
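To illustrate the long text support listed above, here is a minimal sketch of sentence-aware text splitting. It is not TTSFM's own implementation: the function name `split_text` and the `max_chars` limit are hypothetical, chosen only to show the general idea of cutting long input into chunks that can be synthesized separately and combined afterwards.

```python
import re


def split_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper: prefers sentence boundaries and falls back to a
    hard cut only when a single sentence exceeds the limit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A sentence longer than the limit is cut into fixed-size pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would overflow, so start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to the client one at a time, with the resulting audio segments concatenated in order.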
Key Features in v3.4.0:
- Image variant detection (full vs slim Docker images)
- Runtime capabilities API for feature availability checking
- Speed adjustment with ffmpeg-based audio processing
- Real format conversion for all 6 audio formats
- Enhanced error handling with clear, actionable messages
- Dual Docker images optimized for different use cases
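As a rough illustration of ffmpeg-based speed adjustment, the sketch below builds an ffmpeg command that uses the `atempo` audio filter. Whether TTSFM uses `atempo` internally is an assumption, and the helper names `atempo_chain` and `ffmpeg_speed_command` are hypothetical. Each `atempo` stage is kept within 0.5 to 2.0 and stages are chained to cover the full 0.25x to 4.0x range, which is a conservative choice that also works with older ffmpeg builds.

```python
def atempo_chain(speed: float) -> str:
    """Return an ffmpeg audio filter string for the given speed factor.

    Hypothetical helper: splits the factor into stages that each stay
    within 0.5..2.0, e.g. 4.0 becomes two stages of 2.0.
    """
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    factors = []
    remaining = speed
    while remaining > 2.0:
        factors.append(2.0)
        remaining /= 2.0
    while remaining < 0.5:
        factors.append(0.5)
        remaining /= 0.5
    factors.append(remaining)
    return ",".join(f"atempo={factor:g}" for factor in factors)


def ffmpeg_speed_command(src: str, dst: str, speed: float) -> list[str]:
    """Build (but do not run) an ffmpeg command line as an argument list."""
    return ["ffmpeg", "-y", "-i", src, "-filter:a", atempo_chain(speed), dst]
```

The returned list can be handed to `subprocess.run` on a machine where ffmpeg is installed.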
Disclaimer: This project is intended for educational and research purposes only. It is a reverse-engineered implementation of the openai.fm service and should not be used for commercial purposes or in production environments. Users are responsible for ensuring compliance with applicable laws and terms of service.
pip install ttsfm # core client
pip install "ttsfm[web]" # client + Flask web app (quoted so shells such as zsh do not expand the brackets)

TTSFM offers two Docker image variants to suit different needs:
Full image:
docker run -p 8000:8000 dbcccc/ttsfm:latest
Includes ffmpeg for advanced features:
- All 6 audio formats (MP3, WAV, OPUS, AAC, FLAC, PCM)
- Speed adjustment (0.25x - 4.0x)
- Format conversion with ffmpeg
- MP3 auto-combine for long text
- WAV auto-combine for long text
Slim image:
docker run -p 8000:8000 dbcccc/ttsfm:slim
Minimal image without ffmpeg:
- Basic TTS functionality
- 2 audio formats (MP3, WAV only)
- WAV auto-combine for long text
- No speed adjustment
- No format conversion
- No MP3 auto-combine
The container exposes the web playground at http://localhost:8000 and an OpenAI-compatible endpoint at /v1/audio/speech.
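The OpenAI-compatible endpoint can also be called from Python without the SDK. The following is a minimal sketch using only the standard library; it assumes the container is running on localhost:8000 and uses the same JSON fields as the curl examples in this README. The helper names `build_speech_request` and `synthesize` are hypothetical.

```python
import json
import urllib.request


def build_speech_request(
    text: str,
    voice: str = "alloy",
    response_format: str = "mp3",
    base_url: str = "http://localhost:8000",  # assumed local container address
) -> urllib.request.Request:
    """Build a POST request for the /v1/audio/speech endpoint."""
    payload = {
        "model": "tts-1",
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, out_path: str, **kwargs) -> None:
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

With the container running, `synthesize("Hello world!", "speech.mp3")` would save the generated audio to speech.mp3.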
Check available features:
curl http://localhost:8000/api/capabilities

from ttsfm import TTSClient, AudioFormat, Voice
client = TTSClient()

# Basic usage
response = client.generate_speech(
    text="Hello from TTSFM!",
    voice=Voice.ALLOY,
    response_format=AudioFormat.MP3,
)
response.save_to_file("hello")  # -> hello.mp3

# With speed adjustment (requires ffmpeg)
response = client.generate_speech(
    text="This will be faster!",
    voice=Voice.NOVA,
    response_format=AudioFormat.MP3,
    speed=1.5,  # 1.5x speed (0.25 - 4.0)
)
response.save_to_file("fast")  # -> fast.mp3

ttsfm "Hello, world" --voice nova --format mp3 --output hello.mp3

# Basic request
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello world!",
    "voice": "alloy",
    "response_format": "mp3"
  }' --output speech.mp3

# With speed adjustment (requires full image)
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello world!",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.5
  }' --output speech_fast.mp3

Available voices: alloy, echo, fable, onyx, nova, shimmer
Available formats: mp3, wav (always) + opus, aac, flac, pcm (full image only)
Speed range: 0.25 - 4.0 (requires full image)
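The voice, format, and speed constraints above can be checked on the client side before a request is sent. This is a minimal sketch, not part of the TTSFM SDK: the function `check_request` and its `full_image` flag are hypothetical, and it assumes a speed of 1.0 means "no adjustment" and is therefore allowed on the slim image.

```python
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
BASE_FORMATS = {"mp3", "wav"}  # available on both images
FFMPEG_FORMATS = {"opus", "aac", "flac", "pcm"}  # full image only
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def check_request(voice, response_format, speed=1.0, full_image=True):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if voice not in VOICES:
        problems.append(f"unknown voice: {voice}")
    if response_format in FFMPEG_FORMATS:
        if not full_image:
            problems.append(f"format {response_format} requires the full image")
    elif response_format not in BASE_FORMATS:
        problems.append(f"unknown format: {response_format}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        problems.append(f"speed {speed} is outside {MIN_SPEED} - {MAX_SPEED}")
    elif speed != 1.0 and not full_image:
        # Assumption: 1.0 is treated as "no speed adjustment".
        problems.append("speed adjustment requires the full image")
    return problems
```

In practice the `full_image` value would come from the server itself, for example by querying /api/capabilities, rather than being hard-coded.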
- Browse the full API reference and operational notes in the web documentation (or see ttsfm-web/templates/docs.html).
- Read the architecture overview for component diagrams.
- Contributions are welcome; see CONTRIBUTING.md for guidelines.
TTSFM is released under the MIT License.