asr

Star

Here are 1,904 public repositories matching this topic...

m-bain / whisperX

Sponsor

Star

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

speech speech-recognition speech-to-text whisper asr

Updated Apr 4, 2026
Python

NVIDIA-NeMo / NeMo

Star

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

machine-translation tts speech-synthesis neural-networks deeplearning speaker-recognition asr speech-translation speaker-diariazation generative-ai

Updated May 13, 2026
Python

alphacep / vosk-api

Star

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Updated Feb 22, 2026
Jupyter Notebook

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Updated May 7, 2026
Python

k2-fsa / sherpa-onnx

Star

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC-V, RK NPU, Axera NPU, Ascend NPU, x86_64 servers, websocket server/client, support 12 programming languages

Updated May 13, 2026
C++

speechbrain / speechbrain

Star

A PyTorch-based Speech Toolkit

Updated May 3, 2026
Python

FunAudioLLM / SenseVoice

Star

Multilingual Voice Understanding Model

multilingual python ai pytorch speech-recognition speech-to-text asr cross-lingual speech-emotion-recognition audio-event-classification aigc llm gpt-4o

Updated Dec 30, 2025
Python

jdepoix / youtube-transcript-api

Sponsor

Star

This is a python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and it does not require an API key nor a headless browser, like other selenium based solutions do!

python cli youtube youtube-video youtube-api captions subtitles transcript subtitle transcripts asr youtube-subtitles youtube-transcripts youtube-captions youtube-transcript translating-transcripts youtube-asr

Updated May 6, 2026
Python

wzpan / wukong-robot

Sponsor

Star

🤖 wukong-robot 是一个简单、灵活、优雅的中文语音对话机器人/智能音箱项目，支持ChatGPT多轮对话能力，还可能是首个支持脑机交互的开源智能音箱项目。

alexa ai amazon-echo muse tts openai google-home unit bci speaker homeassistant snowboy asr anyq raspeberry-pi gpt3 chatgpt

Updated Oct 25, 2024
Python

xiangyuecn / Recorder

Star

html5 js 录音 mp3 wav ogg webm amr g711a g711u 格式，支持pc和Android、iOS部分Web浏览器、Hybrid App（提供Android iOS App源码）、微信，提供ASR语音识别转文字 H5版语音通话聊天示例 DTMF编码解码

Updated Apr 27, 2026
JavaScript

MahmoudAshraf97 / whisper-diarization

Star

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

speech speech-recognition speech-to-text whisper asr speaker-diarization

Updated Feb 23, 2026
Jupyter Notebook

wenet-e2e / wenet

Star

Production First and Production Ready End-to-End Speech Recognition Toolkit

pytorch transformer speech-recognition automatic-speech-recognition production-ready whisper asr conformer e2e-models

Updated May 11, 2026
Python

umlx5h / LLPlayer

Star

The media player for language learning, with dual subtitles, AI-generated subtitles, real-time translation, and more!

player ocr video csharp video-player wpf language-learning media-player whisper asr flyleaf yt-dlp llm ollama

Updated Apr 19, 2026
C#

PeterH0323 / Streamer-Sales

Star

Streamer-Sales 销冠 —— 卖货主播 LLM 大模型🛒🎁，一个能够根据给定的商品特点从激发用户购买意愿角度出发进行商品解说的卖货主播大模型。🚀⭐内含详细的数据生成流程❗ 📦另外还集成了 LMDeploy 加速推理🚀、RAG检索增强生成 📚、TTS文字转语音🔊、数字人生成 🦸、 Agent 使用网络查询实时信息🌐、ASR 语音转文字🎙️、Vue 生态搭建前端🍍、FastAPI 搭建后端🗝️、Docker-compose 打包部署🐋