Soochow University
- Suzhou
- http://www.jianshu.com/u/52c593425488
Stars
Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…
An instruct text-to-speech solution based on LLaSA and CosyVoice2 developed by the ASLP lab and collaborators.
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
A framework for efficient model inference with omni-modality models
A Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows
SoulX-Podcast is an inference codebase by the Soul AI team for generating high-fidelity podcasts from text.
We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction
We Speech Transcript based on LLM, in 300 lines of code.
Text-audio foundation model from Boson AI
Production First and Production Ready End-to-End Text-to-Speech Toolkit
Text Normalization & Inverse Text Normalization
The Triton TensorRT-LLM Backend
Count the MACs / FLOPs of your PyTorch model.
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
Provides high-quality Chinese speech synthesis and voice cloning services based on models such as SparkTTS and OrpheusTTS.
A Conversational Speech Generation Model
An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & TIS & vLLM & Ray & Async RL)
AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model