Stars
Fast local TTS inference engine in C# with ONNX Runtime. Multi-speaker, multi-platform and multilingual. Integrate into your .NET projects using a plug-and-play NuGet package, complete with all voices.
A modular, non-POSIX operating system for x86_64, built from scratch in C and assembly. Intended to be an educational and experimental project that rigorously follows a Plan9-style "everything is a…
Portable file server with accelerated resumable uploads, dedup, WebDAV, FTP, TFTP, zeroconf, media indexer, thumbnails++ all in one file, no deps
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Generative models for conditional audio generation
A family of state-of-the-art Transformer-based audio codecs for low-bitrate high-quality audio coding.
This repository surveys papers focusing on prompting and adapters for speech processing.
Keeps track of big models in the audio domain, including speech, singing, music, etc.
Implementation of Acoustic BPE (Shen et al., 2024), extended for RVQ-based Neural Audio Codecs
Awesome speech/audio LLMs, representation learning, and codec models
High fidelity, lightweight, end-to-end, streaming, convolution-based neural audio codec
A trainer for SNAC (Multi-Scale Neural Audio Codec) with the decoder replaced by Vocos.
A multi-voice TTS system trained with an emphasis on quality
Instant voice cloning by MIT and MyShell. Audio foundation model.
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.
An Open-source Streaming High-fidelity Neural Audio Codec
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Multi-Scale Neural Audio Codec (SNAC) compresses audio into discrete codes at a low bitrate
Self-Hosting Guide. Learn all about locally hosting (on premises & private web servers) and managing software applications by yourself or your organization. Including Cloud, LLMs, WireGuard, Automa…