Top 23 Python Speech Projects

TTS

1 243 43,441 8.1 Python

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Project mention: 2025 Voice AI Guide: How to Make Your Own Real-Time Voice Agent (Part-1) | dev.to | 2025-09-20

XTTS-v2 — Zero-shot voice cloning, 17 languages, streaming support
MockingBird

2 9 36,745 5.6 Python

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time
datasets

3 18 20,844 9.4 Python

🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

Project mention: Training with Big Data on Any Cloud | dev.to | 2025-06-20

Hugging Face Datasets -- the library that lets you download and manage datasets from the Hugging Face Hub, as well as being a convenient vendor-neutral interface for your own datasets.
whisperX

4 37 18,709 8.6 Python

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Project mention: Making AI Models Faster, Cheaper, and Greener — Here’s How | dev.to | 2025-11-03

2.3X speed improvement over WhisperX and a 3X speed boost compared to HuggingFace Pipeline with FlashAttention 2 (Insanely Fast Whisper)
AudioGPT

5 4 10,200 0.0 Python

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
modelscope

6 3 8,452 9.4 Python

ModelScope: bring the notion of Model-as-a-Service to life.
EmotiVoice

7 5 8,367 7.9 Python

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
silero-vad

8 15 7,348 8.4 Python

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Project mention: 2025 Voice AI Guide: How to Make Your Own Real-Time Voice Agent (Part-1) | dev.to | 2025-09-20

Silero VAD is the gold standard and pipecat has built-in support, so I have chosen that:
ultravox

9 6 4,258 6.1 Python

A fast multimodal LLM for real-time voice

Project mention: I Open-Sourced My AI Toy Company That Runs on ESP32 and OpenAI Realtime API | news.ycombinator.com | 2025-04-22

This looks like so much fun! I have recently gotten into working with electronics, so it seems like a nice little project to undertake.
I noticed that it is dependent on OpenAI's realtime API, so it got me wondering what open alternatives there are.
I could only find ultravox (https://github.com/fixie-ai/ultravox) that would seem to really work as realtime. It seems to be some model that wires up llama and whisper somehow, rather than treating them as separate steps, which is common with other projects.
What other options are available for this kind of real-time behaviour?
speech-to-speech

10 3 4,230 8.7 Python

Speech To Speech: an effort for an open-sourced and modular GPT4-o
metavoice-src

11 5 4,191 7.8 Python

Foundational model for human-like, expressive TTS
DeepFilterNet

12 13 3,407 7.3 Python

Noise suppression using deep filtering

Project mention: Show HN: Background noise removal in multimedia with a single command | news.ycombinator.com | 2025-10-06
whisper-asr-webservice

13 11 3,007 8.1 Python

OpenAI Whisper ASR Webservice API
lingvo

14 1 2,854 6.2 Python

Lingvo
aeneas

15 4 2,742 0.0 Python

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
whisper-timestamped

16 2 2,657 6.1 Python

Multilingual Automatic Speech Recognition with word-level timestamps and confidence
gTTS

17 3 2,549 4.1 Python

Python library and CLI tool to interface with Google Translate's text-to-speech API
IMS-Toucan

18 1 1,654 7.3 Python

Controllable and fast Text-to-Speech for over 7000 languages!
openai-edge-tts

19 1 1,377 7.4 Python

Free, high-quality text-to-speech API endpoint to replace OpenAI, Azure, or ElevenLabs

Project mention: Open source TTS by Resemble (claiming they are sota) | news.ycombinator.com | 2025-06-11

It can definitely run on CPU — but I'm not sure if it can run on a machine without a GPU _entirely_.
To be honest, it uses a decently large amount of resources. If you had a GPU, you could expect about 4-5 GB memory usage. And given the optimizations for tensors on GPUs, I'm not sure how well things would work "CPU only".
If you try it, let me know. There are some "CPU" Docker builds in the repo you could look at for guidance.
If you want free TTS without using local resources, you could try edge-tts https://github.com/travisvn/openai-edge-tts
SALMONN

20 2 1,352 7.3 Python

SALMONN family: A suite of advanced multi-modal LLMs
voicefixer

21 2 1,232 3.5 Python

General Speech Restoration
StreamSpeech

22 3 1,192 3.3 Python

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
dc_tts

23 4 1,160 0.0 Python

A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Speech related posts

Making AI Models Faster, Cheaper, and Greener — Here’s How

5 projects | dev.to | 3 Nov 2025
2025 Voice AI Guide: How to Make Your Own Real-Time Voice Agent (Part-1)

7 projects | dev.to | 20 Sep 2025
Ask HN: What Speaker Diarization tools should I look into?

1 project | news.ycombinator.com | 23 Jul 2025
Training with Big Data on Any Cloud

4 projects | dev.to | 20 Jun 2025
Show HN: Mikey – No bot meeting notetaker for Windows

6 projects | news.ycombinator.com | 12 Feb 2025
Ask HN: Is Whisper Still Relevant?

2 projects | news.ycombinator.com | 12 Feb 2025
Show HN: Using YOLO to Detect Office Chairs in 40M Hotel Photos

4 projects | news.ycombinator.com | 25 Jan 2025

Index

What are some of the best open-source Speech projects in Python? This list will help you:

#	Project	Stars
1	TTS	43,441
2	MockingBird	36,745
3	datasets	20,844
4	whisperX	18,709
5	AudioGPT	10,200
6	modelscope	8,452
7	EmotiVoice	8,367
8	silero-vad	7,348
9	ultravox	4,258
10	speech-to-speech	4,230
11	metavoice-src	4,191
12	DeepFilterNet	3,407
13	whisper-asr-webservice	3,007
14	lingvo	2,854
15	aeneas	2,742
16	whisper-timestamped	2,657
17	gTTS	2,549
18	IMS-Toucan	1,654
19	openai-edge-tts	1,377
20	SALMONN	1,352
21	voicefixer	1,232
22	StreamSpeech	1,192
23	dc_tts	1,160
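
The note above says the list is ordered by GitHub stars. As a minimal sketch, the index table can be parsed and that ordering checked in a few lines of Python. The names `INDEX_TSV`, `Project`, `parse_index`, and `is_ordered_by_stars` are illustrative and not part of any project listed here; the rows are copied from the table as shown.

```python
from typing import NamedTuple

# Rows copied from the index table above: rank, project, stars (tab-separated).
INDEX_TSV = (
    "1\tTTS\t43,441\n"
    "2\tMockingBird\t36,745\n"
    "3\tdatasets\t20,844\n"
    "4\twhisperX\t18,709\n"
    "5\tAudioGPT\t10,200\n"
    "6\tmodelscope\t8,452\n"
    "7\tEmotiVoice\t8,367\n"
    "8\tsilero-vad\t7,348\n"
    "9\tultravox\t4,258\n"
    "10\tspeech-to-speech\t4,230\n"
    "11\tmetavoice-src\t4,191\n"
    "12\tDeepFilterNet\t3,407\n"
    "13\twhisper-asr-webservice\t3,007\n"
    "14\tlingvo\t2,854\n"
    "15\taeneas\t2,742\n"
    "16\twhisper-timestamped\t2,657\n"
    "17\tgTTS\t2,549\n"
    "18\tIMS-Toucan\t1,654\n"
    "19\topenai-edge-tts\t1,377\n"
    "20\tSALMONN\t1,352\n"
    "21\tvoicefixer\t1,232\n"
    "22\tStreamSpeech\t1,192\n"
    "23\tdc_tts\t1,160\n"
)


class Project(NamedTuple):
    """One row of the index table (hypothetical helper type)."""
    rank: int
    name: str
    stars: int


def parse_index(tsv: str) -> list[Project]:
    """Parse tab-separated rows, stripping thousands separators from stars."""
    projects = []
    for line in tsv.strip().splitlines():
        rank, name, stars = line.split("\t")
        projects.append(Project(int(rank), name, int(stars.replace(",", ""))))
    return projects


def is_ordered_by_stars(projects: list[Project]) -> bool:
    """Return True if star counts never increase from one row to the next."""
    stars = [p.stars for p in projects]
    return stars == sorted(stars, reverse=True)


projects = parse_index(INDEX_TSV)
print(len(projects), is_ordered_by_stars(projects))
```

Star counts change over time, so the figures here are only a snapshot of the table as listed.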

Did you know that Python is the 2nd most popular programming language based on number of references?