Versatile Evaluation of Speech and Audio

Python 351 42 Updated Oct 22, 2025

g2p: English Grapheme To Phoneme Conversion

Python 886 133 Updated Jan 5, 2023

GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling

Python 152 23 Updated Feb 28, 2025

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,604 158 Updated Oct 29, 2025

Schedule-Free Optimization in PyTorch

Python 2,228 69 Updated May 21, 2025

A lightweight audio codec based on a single quantizer

Python 64 3 Updated Aug 15, 2025

This is the code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on

Python 615 61 Updated Jun 9, 2024

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 18,704 1,619 Updated Jul 6, 2025

The official implementation of TokenSynth (ICASSP 2025)

Python 75 3 Updated Oct 27, 2025

Official inference code for NAACL 2024 paper "R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces"

Python 2 1 Updated Mar 24, 2025

Updates ASR papers every day

Python 350 18 Updated Oct 29, 2025

Elucidating the Design Space of Diffusion-Based Generative Models (EDM)

Python 1,807 178 Updated Mar 16, 2024

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 3,621 244 Updated Sep 25, 2025

StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation

Python 240 31 Updated Sep 13, 2024

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 10,094 966 Updated Jul 1, 2024

Official code for Wav2Seq

Python 96 13 Updated Jul 19, 2022

GRAFX: An Open-Source Library for Audio Processing Graphs in PyTorch

Python 132 7 Updated Feb 3, 2025
Python 664 25 Updated Dec 5, 2024

Implementation of SoundStorm, Efficient Parallel Audio Generation from Google DeepMind, in PyTorch

Python 1,534 92 Updated Apr 24, 2025

This repo contains the scripts, models, and required files for the Deep Noise Suppression (DNS) Challenge.

Python 1,306 438 Updated Jul 25, 2024

Official implementation for Diffusion Models Are Innate One-Step Generators

Python 25 2 Updated Jun 25, 2025

Scientific literature about Audio Effects

HTML 148 2 Updated Feb 6, 2025

Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)

Python 542 33 Updated Oct 29, 2025

PyTorch implementation of the Perceptual Evaluation of Speech Quality for wideband audio

Python 209 15 Updated Jul 14, 2023

Conformer-based Metric GAN for speech enhancement

Python 389 66 Updated May 3, 2024
Python 30 2 Updated Jan 9, 2024

This is the official implementation of the SEMamba paper. (Accepted to IEEE SLT 2024)

Python 223 26 Updated Jun 5, 2025

Official code for the CVPR 2025 paper "SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models."

Jupyter Notebook 581 51 Updated Jun 1, 2025