Concordia University - Montréal, Québec, Canada
https://www.linkedin.com/in/luca-della-libera
Stars
A method that directly addresses the modality gap by aligning speech tokens with the corresponding text transcription during the tokenization stage.
Llama-Mimi is a speech language model that uses a unified tokenizer (Mimi) and a single Transformer decoder (Llama) to jointly model sequences of interleaved semantic and acoustic tokens.
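The interleaving idea is easy to illustrate: each audio frame contributes one semantic token followed by the acoustic tokens from the remaining codebooks, and the flattened stream is modeled autoregressively by a single decoder. A minimal sketch in plain Python (the frame layout and token values are illustrative assumptions, not Llama-Mimi's exact format):

```python
# Hypothetical per-frame tokens: codebook 0 is "semantic", the rest are acoustic.
# A Mimi-style codec emits one token per codebook per frame; values here are made up.
frames = [
    [101, 7, 42, 13],  # frame 0: [semantic, acoustic_1, acoustic_2, acoustic_3]
    [102, 9, 55, 21],  # frame 1
    [103, 3, 17, 38],  # frame 2
]

# Flatten into a single interleaved sequence for a decoder-only LM:
# s_0, a_0^1, a_0^2, a_0^3, s_1, a_1^1, ...
interleaved = [tok for frame in frames for tok in frame]
print(interleaved)  # [101, 7, 42, 13, 102, 9, 55, 21, 103, 3, 17, 38]
```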
Multi-band Frequency Reconstruction for Neural Psychoacoustic Coding
Codec for the paper "LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis"
[NAACL 2025] WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching
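Flow-matching vocoders of this kind train a network to predict the velocity that transports noise to the target waveform along a straight path. A generic conditional flow-matching loss in PyTorch (this is the standard objective, not WaveFM's actual training code; `model` and its conditioning are placeholders):

```python
import torch

def flow_matching_loss(model, x1, cond):
    """Generic conditional flow-matching objective (not WaveFM's exact code).

    x1:   target waveform batch, shape (B, T)
    cond: conditioning features (e.g. a mel spectrogram), passed through as-is
    """
    x0 = torch.randn_like(x1)          # noise sample
    t = torch.rand(x1.shape[0], 1)     # per-example time in [0, 1]
    xt = (1 - t) * x0 + t * x1         # point on the straight noise-to-data path
    target_velocity = x1 - x0          # d(xt)/dt along that path
    pred_velocity = model(xt, t, cond) # network predicts the velocity
    return torch.mean((pred_velocity - target_velocity) ** 2)
```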
Awesome speech/audio LLMs, representation learning, and codec models
Official implementation of the paper "BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec"
Hackable and optimized Transformer building blocks, supporting composable construction.
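Its best-known primitive is memory-efficient attention; a minimal call looks like this (assuming xFormers is installed with a working CUDA backend; shapes follow the batch, sequence, heads, head-dim convention):

```python
import torch
import xformers.ops as xops

# Query/key/value in (batch, seq_len, num_heads, head_dim) layout.
q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)

# Computes softmax(q @ k^T / sqrt(d)) @ v without materializing the full
# attention matrix, saving memory at long sequence lengths.
out = xops.memory_efficient_attention(q, k, v)
```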
[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling
Faster Whisper transcription with CTranslate2
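A typical transcription call, following the project's README (the model size, device settings, and audio path are placeholder choices):

```python
from faster_whisper import WhisperModel

# "large-v3" and the CUDA/float16 settings are illustrative choices.
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

segments, info = model.transcribe("audio.wav", beam_size=5)
print(f"Detected language: {info.language} (p={info.language_probability:.2f})")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```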
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.
Vector (and Scalar) Quantization, in Pytorch
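Basic usage mirrors the library's README: quantize a batch of vectors against a learned codebook and get back the quantized output, code indices, and commitment loss (the hyperparameters below follow the README example and are assumptions, not recommendations):

```python
import torch
from vector_quantize_pytorch import VectorQuantize

vq = VectorQuantize(
    dim=256,                # feature dimension of the inputs
    codebook_size=512,      # number of codebook entries
    decay=0.8,              # EMA decay for codebook updates
    commitment_weight=1.0,  # weight of the commitment loss term
)

x = torch.randn(1, 1024, 256)            # (batch, sequence, dim)
quantized, indices, commit_loss = vq(x)  # (1, 1024, 256), (1, 1024), scalar
```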
[ICASSP 2024] Generative De-Quantization for Neural Speech Codec via Latent Diffusion.
[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Continual Learning papers list, curated by ContinualAI
Pytorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5)
Convert Machine Learning Code Between Frameworks
An elegant PyTorch deep reinforcement learning library.
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
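The standard interaction loop under this API takes a few lines (CartPole-v1 is just one of the reference environments):

```python
import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=42)

for _ in range(200):
    action = env.action_space.sample()  # random policy for illustration
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:         # episode ended or time limit hit
        obs, info = env.reset()

env.close()
```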
A toolkit for tinyML research and deployment
Structured state space sequence models
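At their core, these models apply a discretized linear state-space recurrence per channel. A naive sequential reference in PyTorch (real S4/S5 implementations use specific parameterizations plus parallel scans or convolutions; this sketch only shows the recurrence itself, with made-up parameters):

```python
import torch

def ssm_scan(A, B, C, u):
    """Naive diagonal SSM recurrence: x_k = A * x_{k-1} + B * u_k, y_k = C * x_k.

    A, B, C: (state_dim,) diagonal parameters (assumed already discretized)
    u:       (seq_len,) scalar input sequence
    """
    x = torch.zeros_like(A)
    ys = []
    for u_k in u:
        x = A * x + B * u_k       # state update
        ys.append((C * x).sum())  # scalar readout
    return torch.stack(ys)

# Toy example with a stable diagonal system.
state_dim = 16
A = torch.rand(state_dim) * 0.9  # eigenvalues in (0, 0.9) for stability
B = torch.randn(state_dim)
C = torch.randn(state_dim)
y = ssm_scan(A, B, C, torch.randn(128))
```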