
TTSFM mirrors OpenAI's TTS service, providing a compatible interface for text-to-speech conversion with multiple voice options for free.


dbccccccc/ttsfm


TTSFM - Text-to-Speech API Client

Language: English | 中文 (Chinese)


Overview

TTSFM is a free, OpenAI-compatible text-to-speech API service that converts text to natural-sounding speech using OpenAI's GPT-4o mini TTS model. Built on top of the openai.fm backend, it provides a Python SDK, RESTful API endpoints, and a web playground for testing and integration.

What TTSFM Can Do:

  • 🎤 Multiple Voices: Choose from 6 high-quality voices (alloy, echo, fable, onyx, nova, shimmer)
  • 🎵 Flexible Audio Formats: Support for 6 audio formats (MP3, WAV, OPUS, AAC, FLAC, PCM)
  • ⚡ Speed Control: Adjust playback speed from 0.25x to 4.0x for different use cases
  • 📝 Long Text Support: Automatic text splitting and audio combining for content of any length
  • 🔄 Real-time Streaming: WebSocket support for streaming audio generation
  • 🐍 Python SDK: Easy-to-use synchronous and asynchronous clients
  • 🌐 Web Playground: Interactive web interface for testing and experimentation
  • 🐳 Docker Ready: Pre-built Docker images for instant deployment
  • 🔍 Smart Detection: Automatic capability detection and helpful error messages
  • 🤖 OpenAI Compatible: Drop-in replacement for OpenAI's TTS API
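To illustrate the idea behind long text support, here is a minimal sketch of splitting text at sentence boundaries before synthesis. It is not TTSFM's actual splitter: the function name `split_text` and the `max_chars` limit are hypothetical and chosen only for illustration.

```python
import re


def split_text(text: str, max_chars: int = 1000) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence breaks.

    Hypothetical helper: max_chars is an illustrative limit, not a value
    taken from TTSFM.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # Break after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be synthesized separately and the resulting audio segments combined, which is the behavior the feature list describes.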

Key Features in v3.4.0:

  • 🎯 Image variant detection (full vs slim Docker images)
  • 🔍 Runtime capabilities API for feature availability checking
  • ⚡ Speed adjustment with ffmpeg-based audio processing
  • 🎵 Real format conversion for all 6 audio formats
  • 📊 Enhanced error handling with clear, actionable messages
  • 🐳 Dual Docker images optimized for different use cases

⚠️ Disclaimer: This project is intended for educational and research purposes only. It is a reverse-engineered implementation of the openai.fm service and should not be used for commercial purposes or in production environments. Users are responsible for ensuring compliance with applicable laws and terms of service.

Installation

Python package

pip install ttsfm          # core client
pip install "ttsfm[web]"   # client + Flask web app (quotes keep shells such as zsh from expanding the brackets)

Docker image

TTSFM offers two Docker image variants to suit different needs:

Full variant (recommended)

docker run -p 8000:8000 dbcccc/ttsfm:latest

Includes ffmpeg for advanced features:

  • ✅ All 6 audio formats (MP3, WAV, OPUS, AAC, FLAC, PCM)
  • ✅ Speed adjustment (0.25x - 4.0x)
  • ✅ Format conversion with ffmpeg
  • ✅ MP3 auto-combine for long text
  • ✅ WAV auto-combine for long text

Slim variant - ~100MB

docker run -p 8000:8000 dbcccc/ttsfm:slim

Minimal image without ffmpeg:

  • ✅ Basic TTS functionality
  • ✅ 2 audio formats (MP3, WAV only)
  • ✅ WAV auto-combine for long text
  • ❌ No speed adjustment
  • ❌ No format conversion
  • ❌ No MP3 auto-combine
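The difference between the two variants can be expressed as a small client-side check. The sketch below encodes the format lists from this section. The helper name `resolve_format` and the fallback behavior are assumptions for illustration, not part of the TTSFM SDK.

```python
# Formats per image variant, as listed in this README.
FULL_FORMATS = {"mp3", "wav", "opus", "aac", "flac", "pcm"}
SLIM_FORMATS = {"mp3", "wav"}


def resolve_format(requested: str, has_ffmpeg: bool, fallback: str = "mp3") -> str:
    """Return the requested format if the variant supports it, else the fallback.

    Hypothetical helper: has_ffmpeg=True stands for the full image,
    has_ffmpeg=False for the slim image.
    """
    fmt = requested.lower()
    if fmt not in FULL_FORMATS:
        raise ValueError(f"unknown audio format: {requested!r}")
    supported = FULL_FORMATS if has_ffmpeg else SLIM_FORMATS
    return fmt if fmt in supported else fallback
```

With the slim image, a request for FLAC would fall back to MP3 here, while the full image would keep FLAC.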

The container exposes the web playground at http://localhost:8000 and an OpenAI-compatible endpoint at /v1/audio/speech.

Check available features:

curl http://localhost:8000/api/capabilities
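The same check can be done from Python with the standard library. This sketch assumes the endpoint returns a JSON document. The response fields are not documented here, so the code only parses and returns whatever comes back. The function names and the injectable `opener` parameter are illustrative.

```python
import json
import urllib.request


def capabilities_url(base_url: str) -> str:
    """Build the capabilities URL for a TTSFM server."""
    return base_url.rstrip("/") + "/api/capabilities"


def fetch_capabilities(
    base_url: str = "http://localhost:8000",
    opener=urllib.request.urlopen,
    timeout: float = 10,
):
    """Fetch and decode the capabilities document (assumed to be JSON)."""
    with opener(capabilities_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a container running locally, `print(fetch_capabilities())` would show which features the image reports as available.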

Quick start

Python client

from ttsfm import TTSClient, AudioFormat, Voice

client = TTSClient()

# Basic usage
response = client.generate_speech(
    text="Hello from TTSFM!",
    voice=Voice.ALLOY,
    response_format=AudioFormat.MP3,
)
response.save_to_file("hello")  # -> hello.mp3

# With speed adjustment (requires ffmpeg)
response = client.generate_speech(
    text="This will be faster!",
    voice=Voice.NOVA,
    response_format=AudioFormat.MP3,
    speed=1.5,  # 1.5x speed (0.25 - 4.0)
)
response.save_to_file("fast")  # -> fast.mp3

CLI

ttsfm "Hello, world" --voice nova --format mp3 --output hello.mp3

REST API (OpenAI-compatible)

# Basic request
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello world!",
    "voice": "alloy",
    "response_format": "mp3"
  }' --output speech.mp3

# With speed adjustment (requires full image)
curl -X POST http://localhost:8000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello world!",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.5
  }' --output speech_fast.mp3

  • Available voices: alloy, echo, fable, onyx, nova, shimmer
  • Available formats: mp3, wav (always) + opus, aac, flac, pcm (full image only)
  • Speed range: 0.25 - 4.0 (requires full image)


License

TTSFM is released under the MIT License.
