Stars
SoftVC VITS Singing Voice Conversion
[ACL 2025 Oral] Language-Codec: Reducing the Gaps Between Discrete Codec Representation and Speech Language Models
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in PyTorch
Vector (and Scalar) Quantization, in PyTorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
speech self-supervised representations
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
Recurrent neural network for audio noise reduction
Simple and Efficient TensorFlow implementations of NER models with tf.estimator and tf.data
Facebook AI Research's Automatic Speech Recognition Toolkit
End-to-End speech recognition implementation based on TensorFlow (CTC, Attention, and MTL training)
End-to-end ASR/LM implementation with PyTorch
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translatio…
An attempt at tracking the state of the art and recent results (bibliography) on speech recognition.
A PyTorch Implementation of End-to-End Models for Speech-to-Text
The RWTH extensible training framework for universal recurrent neural networks
This is an open source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with PyTorch, the well-known deep learning toolkit.
Speech-to-Text-WaveNet: End-to-end sentence-level English speech recognition based on DeepMind's WaveNet and TensorFlow
Speech Recognition using DeepSpeech2.
kaldi-asr/kaldi is the official location of the Kaldi project.
Reference implementations of MLPerf® training benchmarks
Models and examples built with TensorFlow