Supertonic TTS Studio

Supertonic TTS Studio is a text-to-speech application that runs entirely on the CPU and supports both near-real-time continuous streaming and full audio file generation. It builds on the Supertonic repository, which provides the ONNX models and voice styles used for high-quality synthesis.

Features

Real-time continuous streaming of generated speech.
Full audio file generation with download capability.
Multiple voice styles with descriptions.
Adjustable speaking speed.
Character and word count tracking for input text.
Professional Gradio-based user interface.
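The character and word count feature can be illustrated with a few lines of Python. This is a minimal sketch, not the code in app.py; the function name count_text is hypothetical, and it assumes words are separated by whitespace.

```python
def count_text(text: str) -> tuple[int, int]:
    """Return (character_count, word_count) for the input text."""
    characters = len(text)
    # str.split() with no arguments splits on any run of whitespace,
    # so blank lines between paragraphs do not produce empty words.
    words = len(text.split())
    return characters, words
```

Calling count_text on the contents of the text box whenever it changes would be enough to keep both counters up to date.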

Installation

Clone this repository:

git clone https://github.com/aritrodium/Supertonic-TTS.git
cd Supertonic-TTS

Install the required dependencies:

pip install -r requirements.txt
pip install gradio numpy soundfile

Clone the Supertonic model repository from Hugging Face into the project directory and pull its Git LFS files:

git lfs install
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/Supertone/supertonic supertonic
cd supertonic
git lfs pull
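Because the clone above skips the LFS download, the model files are small text pointers until git lfs pull completes. The sketch below is one way to check that no pointers remain. It relies on the fact that Git LFS pointer files begin with the line "version https://git-lfs.github.com/spec/v1"; the function name find_unpulled_lfs_files and the *.onnx pattern are assumptions made for illustration.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_unpulled_lfs_files(root, pattern="*.onnx"):
    """Return files under root matching pattern that are still LFS pointers."""
    unpulled = []
    for path in sorted(Path(root).rglob(pattern)):
        with path.open("rb") as handle:
            head = handle.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            unpulled.append(path)
    return unpulled
```

An empty list from find_unpulled_lfs_files("supertonic") suggests the matching model files were downloaded; any paths returned are files that git lfs pull has not yet replaced.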

Usage

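Run the application with Python from the root of the Supertonic-TTS repository (if you are still inside supertonic/ after the installation steps, return with cd .. first):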

python app.py

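The Gradio server binds to 0.0.0.0 on port 7860, so it listens on all network interfaces. Open http://localhost:7860 in a browser on the same machine, or use the machine's IP address from another device on the network. To get a temporary public link, pass share=True to Gradio's launch() call.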
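The launch settings described above map onto keyword arguments of Gradio's launch() method. The sketch below shows them with a placeholder interface; the helper names launch_options and main are hypothetical, and the real app.py builds its own UI.

```python
def launch_options(share: bool = False, port: int = 7860) -> dict:
    """Keyword arguments for Gradio's launch() method."""
    return {
        "server_name": "0.0.0.0",  # bind to all network interfaces
        "server_port": port,
        "share": share,  # True asks Gradio for a temporary public link
    }


def main() -> None:
    # Imported here so the helper above has no third-party dependencies.
    import gradio as gr

    # Placeholder interface that echoes its input; not the actual TTS UI.
    demo = gr.Interface(fn=lambda text: text, inputs="text", outputs="text")
    demo.launch(**launch_options(share=False))
```

Calling main() starts the placeholder server; switching to launch_options(share=True) requests the public link instead of a local-only address.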

Input Controls

Text Input: Enter the text to synthesize. Supports multiple paragraphs.
Voice Style: Select a preferred voice style from the available options.
Speaking Speed: Adjust the rate of speech generation.

Output

Streaming Output: Continuous playback of generated speech chunks.
Download Output: Complete audio file generation and download.
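The difference between the two outputs can be sketched as a generator versus a concatenation. This is an illustration under assumptions, not the project's implementation: the text is split on sentence-ending punctuation, and fake_synthesize is a stand-in that returns silent samples where the real ONNX model would be called.

```python
import re
from typing import Callable, Iterator, List


def split_sentences(text: str) -> List[str]:
    """Split text into chunks at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def stream_chunks(
    text: str, synthesize: Callable[[str], List[float]]
) -> Iterator[List[float]]:
    """Streaming mode: yield audio for each sentence as soon as it is ready."""
    for sentence in split_sentences(text):
        yield synthesize(sentence)


def generate_full(text: str, synthesize: Callable[[str], List[float]]) -> List[float]:
    """Full mode: collect every chunk into one sample list for download."""
    samples: List[float] = []
    for chunk in stream_chunks(text, synthesize):
        samples.extend(chunk)
    return samples


def fake_synthesize(sentence: str) -> List[float]:
    """Hypothetical stand-in for the model: one silent sample per character."""
    return [0.0] * len(sentence)
```

Streaming lets playback begin after the first sentence is synthesized, while the full mode waits for all chunks before producing a single file.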

Project Structure

Supertonic-TTS/
├── app.py                 # Main Gradio application
├── supertonic/            # Cloned Supertonic repository
│   ├── onnx/              # Pretrained ONNX models
│   ├── voice_styles/      # JSON files for voice styles
│   └── tts_model.py       # TTS model functions
└── README.md

Functions

run_tts_stream(text, speed, style_name) – Generates speech in streaming mode, yielding audio chunks and status updates.
run_tts_full(text, speed, style_name) – Generates a complete audio file for download.
list_voice_styles() – Lists all available voice style JSON files.
load_style_by_name(filename) – Loads a specific voice style.
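As a rough sketch of how the two voice style helpers might work, the code below lists and loads JSON files from a directory. The default path supertonic/voice_styles is taken from the project structure above; the styles_dir parameter and the function bodies are assumptions, and the actual implementations may differ.

```python
import json
from pathlib import Path

# Assumed location, based on the project structure listing.
DEFAULT_STYLES_DIR = "supertonic/voice_styles"


def list_voice_styles(styles_dir=DEFAULT_STYLES_DIR):
    """Return the sorted file names of all JSON voice styles."""
    return sorted(path.name for path in Path(styles_dir).glob("*.json"))


def load_style_by_name(filename, styles_dir=DEFAULT_STYLES_DIR):
    """Load one voice style file and return its parsed JSON content."""
    path = Path(styles_dir) / filename
    with path.open("r", encoding="utf-8") as handle:
        return json.load(handle)
```

The names returned by list_voice_styles() could populate the Voice Style selector, with load_style_by_name() called on whichever entry the user picks.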

Contributing

Contributions are welcome. Please fork the repository, make your changes, and submit a pull request with a clear description of the changes.

License

This project is licensed under the MIT License.
