Stars
DiFlow-TTS: Compact and Low-Latency Zero-Shot Text-to-Speech with Factorized Discrete Flow Matching
Awesome speech/audio LLMs, representation learning, and codec models
DiFlow-TTS delivers low-latency zero-shot TTS via discrete flow matching and factorized speech tokens. A compact, open framework for fast voice synthesis. 🐙
Evaluation software used in the Text Retrieval Conference
Unsupervised phone and word segmentation using dynamic programming on self-supervised VQ features.
PyPI-installable TDNN and TDNN-F layers for PyTorch-based acoustic model training
FSA/FST algorithms, differentiable, with PyTorch compatibility.
Implementation for the manuscript submission "Towards Unsupervised Speech Recognition Without Pronunciation Models"
PyTorch implementations of deep reinforcement learning algorithms and environments
This is the official code release for Bayesian Flow Networks.
Reliability diagrams visualize whether a classifier model needs calibration
A playbook for systematically maximizing the performance of deep learning models.
SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders.
AI powered speech denoising and enhancement
(ICASSP 2025, official code) FlowSE: Flow Matching-based Speech Enhancement
This is the official implementation of the SEMamba paper. (Accepted to IEEE SLT 2024)
Efficient 3D molecular generation with flow-matching and Semla
Official implementation of All Atom Diffusion Transformers (ICML 2025)
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation