Python Speech

Open-source Python projects categorized as Speech

Top 23 Python Speech Projects

  1. TTS

    🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

    Project mention: 2025 Voice AI Guide: How to Make Your Own Real-Time Voice Agent (Part-1) | dev.to | 2025-09-20

    XTTS-v2 — Zero-shot voice cloning, 17 languages, streaming support

  3. MockingBird

    🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time

  4. datasets

    🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools

    Project mention: Training with Big Data on Any Cloud | dev.to | 2025-06-20

    Hugging Face Datasets -- the library that lets you download and manage datasets from the Hugging Face Hub, as well as being a convenient vendor-neutral interface for your own datasets.

  5. whisperX

    WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

    Project mention: Making AI Models Faster, Cheaper, and Greener — Here’s How | dev.to | 2025-11-03

    2.3X speed improvement over WhisperX and a 3X speed boost compared to HuggingFace Pipeline with FlashAttention 2 (Insanely Fast Whisper)

  6. AudioGPT

    AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

  7. modelscope

    ModelScope: bring the notion of Model-as-a-Service to life.

  8. EmotiVoice

    EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

  10. silero-vad

    Silero VAD: pre-trained enterprise-grade Voice Activity Detector

    Project mention: 2025 Voice AI Guide: How to Make Your Own Real-Time Voice Agent (Part-1) | dev.to | 2025-09-20

    Silero VAD is the gold standard and pipecat has built-in support, so I have chosen that:

  11. ultravox

    A fast multimodal LLM for real-time voice

    Project mention: I Open-Sourced My AI Toy Company That Runs on ESP32 and OpenAI Realtime API | news.ycombinator.com | 2025-04-22

    This looks like so much fun! I have recently gotten into working with electronics, so it seems like a nice little project to undertake.

    I noticed that it is dependent on OpenAI's realtime API, so it got me wondering what open alternatives there are.

    I could only find ultravox (https://github.com/fixie-ai/ultravox) that would seem to really work as realtime. It seems to be some model that wires up llama and whisper somehow, rather than treating them as separate steps, which is common with other projects.

    What other options are available for this kind of real-time behaviour?

  12. speech-to-speech

    Speech To Speech: an effort for an open-sourced and modular GPT-4o

  13. metavoice-src

    Foundational model for human-like, expressive TTS

  14. DeepFilterNet

    Noise suppression using deep filtering

    Project mention: Show HN: Background noise removal in multimedia with a single command | news.ycombinator.com | 2025-10-06
  15. whisper-asr-webservice

    OpenAI Whisper ASR Webservice API

  16. lingvo

    Lingvo

  17. aeneas

    aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)

  18. whisper-timestamped

    Multilingual Automatic Speech Recognition with word-level timestamps and confidence

  19. gTTS

    Python library and CLI tool to interface with Google Translate's text-to-speech API

  20. IMS-Toucan

    Controllable and fast Text-to-Speech for over 7000 languages!

  21. openai-edge-tts

    Free, high-quality text-to-speech API endpoint to replace OpenAI, Azure, or ElevenLabs

    Project mention: Open source TTS by Resemble (claiming they are sota) | news.ycombinator.com | 2025-06-11

    It can definitely run on CPU — but I'm not sure if it can run on a machine without a GPU _entirely_.

    To be honest, it uses a decently large amount of resources. If you had a GPU, you could expect about 4-5 GB memory usage. And given the optimizations for tensors on GPUs, I'm not sure how well things would work "CPU only".

    If you try it, let me know. There are some "CPU" Docker builds in the repo you could look at for guidance.

    If you want free TTS without using local resources, you could try edge-tts https://github.com/travisvn/openai-edge-tts

  22. SALMONN

    SALMONN family: A suite of advanced multi-modal LLMs

  23. voicefixer

    General Speech Restoration

  24. StreamSpeech

    StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

  25. dc_tts

    A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python Speech related posts

  • Making AI Models Faster, Cheaper, and Greener — Here’s How

    5 projects | dev.to | 3 Nov 2025
  • 2025 Voice AI Guide: How to Make Your Own Real-Time Voice Agent (Part-1)

    7 projects | dev.to | 20 Sep 2025
  • Ask HN: What Speaker Diarization tools should I look into?

    1 project | news.ycombinator.com | 23 Jul 2025
  • Training with Big Data on Any Cloud

    4 projects | dev.to | 20 Jun 2025
  • Show HN: Mikey – No bot meeting notetaker for Windows

    6 projects | news.ycombinator.com | 12 Feb 2025
  • Ask HN: Is Whisper Still Relevant?

    2 projects | news.ycombinator.com | 12 Feb 2025
  • Show HN: Using YOLO to Detect Office Chairs in 40M Hotel Photos

    4 projects | news.ycombinator.com | 25 Jan 2025

Index

What are some of the best open-source Speech projects in Python? This list will help you:

# Project Stars
1 TTS 43,441
2 MockingBird 36,745
3 datasets 20,844
4 whisperX 18,709
5 AudioGPT 10,200
6 modelscope 8,452
7 EmotiVoice 8,367
8 silero-vad 7,348
9 ultravox 4,258
10 speech-to-speech 4,230
11 metavoice-src 4,191
12 DeepFilterNet 3,407
13 whisper-asr-webservice 3,007
14 lingvo 2,854
15 aeneas 2,742
16 whisper-timestamped 2,657
17 gTTS 2,549
18 IMS-Toucan 1,654
19 openai-edge-tts 1,377
20 SALMONN 1,352
21 voicefixer 1,232
22 StreamSpeech 1,192
23 dc_tts 1,160
