Comprehensive Gradio WebUI for audio processing
A deep learning toolkit for Text-to-Speech, battle-tested in research
One minute of voice data is enough to train a good TTS model
elevenlabs-api is an open source Java wrapper around the ElevenLabs
Just a Better Chatbot. Powered by MCP Client & Workflows
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Easy-to-use Speech Toolkit including Self-Supervised Learning model
Simple and powerful voice changer for Linux, written with Python & GTK
A Python/PyTorch app for easily synthesising human voices
Singing voice conversion based on Whisper, with LoRA for singing voice cloning
MARS5 is a fully open-source, hyper-realistic text-to-speech (TTS).
[WIP] VoiceSmith makes training text to speech models easy
VoiceOver is a web application that allows you to transcribe audio
Clone a voice in 5 seconds to generate arbitrary speech in real time
PAddle PARAllel text-to-speech toolKIT
An implementation of Tacotron 2 that supports multilingual experiments
A Python/PyTorch app for easily synthesising human voices
Dia-1.6B generates lifelike English dialogue and vocal expressions