cdliang11
Python 1,903 154 Updated Dec 21, 2025

A streaming audio reader, processor, and writer built on top of soundfile and PyAV (bindings for FFmpeg)

Python 33 3 Updated Dec 19, 2025

Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab.

Python 455 28 Updated Dec 19, 2025

A PyTorch-based knowledge distillation toolkit for natural language processing

Python 1,689 247 Updated May 8, 2023

A framework for efficient model inference with omni-modality models

Python 1,160 153 Updated Dec 22, 2025

The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

Jupyter Notebook 670 43 Updated Dec 20, 2025

A Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows

Python 183 10 Updated Dec 17, 2025

An N-gram punctuator for Chinese and English.

Python 17 3 Updated Oct 14, 2025

Torch Audio Forced Aligner for Mixed Chinese (Mandarin or Cantonese) and English.

Python 61 3 Updated Sep 5, 2025

We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction

Python 167 11 Updated Dec 16, 2025

MiMo-Audio: Audio Language Models are Few-Shot Learners

Python 906 87 Updated Sep 20, 2025

A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation

Python 247 10 Updated Nov 30, 2025

Open-Source Frontier Voice AI

Python 18,807 2,080 Updated Dec 17, 2025

A Chinese instruction dataset for fine-tuning LLMs

28 1 Updated Apr 12, 2023

Goal: compile a high-quality classical Chinese poetry dataset for large models, spanning the pre-Qin era to modern times

117 16 Updated Feb 25, 2024

Alpaca Chinese instruction fine-tuning dataset

396 24 Updated Mar 26, 2023

FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice.

Python 232 25 Updated Nov 11, 2025

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 14,057 1,459 Updated Dec 19, 2025

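AIInfra (AI infrastructure) refers to the AI system stack, from low-level hardware such as chips up to the upper-level software stack, that supports training and inference of large AI models.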

Jupyter Notebook 5,475 761 Updated Dec 3, 2025

High-speed downloads from mirror sites using HuggingFace's official download tool.

Python 1,272 113 Updated Oct 12, 2024

✨✨[NeurIPS 2025] VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model

Python 667 60 Updated May 24, 2025

A curated list of resources in audio question answering and related areas. :-)

7 2 Updated Jun 29, 2025

This package aims at simplifying the download of the strong version of AudioSet dataset.

Python 8 Updated Dec 16, 2024

Python 109 11 Updated Sep 18, 2025

A 6-million Audio-Caption Paired Dataset Built with an LLM- and ALM-based Automatic Pipeline

Python 192 5 Updated Dec 13, 2024

MUSIC-AVQA, CVPR2022 (ORAL)

Python 90 9 Updated Dec 30, 2022

Efficient audio understanding with general audio captions

Python 390 39 Updated Nov 3, 2025

Text-audio foundation model from Boson AI

Python 7,760 577 Updated Sep 15, 2025