[ACL 2024] Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Python 1,037 79 Updated Dec 23, 2024

ga642381 / speech-trident

Awesome speech/audio LLMs, representation learning, and codec models

1,199 74 Updated Aug 13, 2025

lmxue / Audio-FLAN

Audio-FLAN

Jupyter Notebook 160 5 Updated Sep 23, 2025

Yuan-ManX / ai-audio-datasets

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…

889 86 Updated Jul 8, 2025

modelscope / 3D-Speaker

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Python 2,731 241 Updated Dec 8, 2025

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 9,657 788 Updated May 27, 2025

kyutai-labs / moshi

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 9,322 844 Updated Jan 8, 2026

AudioLLMs / Awesome-Audio-LLM

Audio Large Language Models

Python 851 43 Updated Jul 5, 2025

jingyaogong / minimind

🚀🚀 Train a 26M-parameter GPT (a small "large model") completely from scratch in just 2 hours! 🌏

Python 37,558 4,491 Updated Jan 18, 2026

mlabonne / llm-datasets

Curated list of datasets and tools for post-training.

4,173 342 Updated Nov 10, 2025

mlabonne / llm-course

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

73,389 8,425 Updated Dec 22, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 25,828 2,411 Updated Nov 24, 2025

OpenBMB / MiniCPM-V

MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

Python 22,636 1,712 Updated Sep 24, 2025

modelscope / ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 3,833 313 Updated Aug 14, 2025

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 44,274 5,918 Updated Aug 16, 2024

lenML / Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

Python 1,377 182 Updated Sep 16, 2025

bmcfee / pyrubberband

Python wrapper for rubberband

Python 211 26 Updated Sep 30, 2024

datajuicer / data-juicer

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 5,744 317 Updated Jan 16, 2026

xinyu1205 / recognize-anything

Open-source and strong foundation image recognition models.

Jupyter Notebook 3,566 316 Updated Feb 18, 2025

facebookresearch / faiss

A library for efficient similarity search and clustering of dense vectors.

C++ 38,788 4,190 Updated Jan 16, 2026

VILA-Lab / ATLAS

A principled instruction benchmark on formulating effective queries and prompts for large language models (LLMs). Our paper: https://arxiv.org/abs/2312.16171

Python 980 102 Updated May 28, 2024

vietanhdev / anylabeling

Effortless AI-assisted data labeling with AI support from YOLO, Segment Anything (SAM+SAM2), MobileSAM!!

Python 3,150 329 Updated Jan 6, 2026

pyecharts / pyecharts-gallery

Uses pyecharts to reproduce the official ECharts examples.

HTML 1,423 608 Updated Aug 4, 2025

MikeGu721 / CS_arxiv_everyweek

Weekly updates of Computer Science papers uploaded to arXiv.

JavaScript 106 1 Updated Jan 13, 2026

twang flyingmrwang

Stars

dreamtheater123 / Awesome-SpeechLM-Survey

yt-dlp / yt-dlp

Plachtaa / seed-vc

OpenBMB / RLPR

NanoNets / docext

RVC-Boss / GPT-SoVITS

ddlBoJack / emotion2vec