Sheng Li halspeech

Ph.D. (Kyoto Univ.); now faculty (assistant professor) at Institute of Science Tokyo. E-mail: sheng dot li at ieee dot org

Institute of Science Tokyo (Science Tokyo, formerly Tokyo Tech)
Tokyo, Japan
https://scholar.google.com/citations?hl=en&user=zHAhs0IAAAAJ

Stars

Audio-WestlakeU / Rec-RIR

Official PyTorch implementation of 'Rec-RIR: Monaural Blind Room Impulse Response Identification via DNN-based Reverberant Speech Reconstruction in STFT Domain'

Python · 15 stars · 2 forks · Updated Sep 26, 2025

tltrogl / diaremot2-on

DiaRemot2-ON: CPU-only audio intelligence pipeline (Faster-Whisper, ONNX, diarization, paralinguistics)

Python · 6 stars · 1 fork · Updated Oct 26, 2025

hi-paris / CosyVoice2-EU

Europeanized CosyVoice2 for French & German

Jupyter Notebook · 4 stars · Updated Oct 16, 2025

kaistmm / AlignDiT

[ACM MM 2025] AlignDiT: Multimodal Aligned Diffusion Transformer for Synchronized Speech Generation

Python · 12 stars · 1 fork · Updated Oct 4, 2025

DanielLin94144 / Full-Duplex-Bench

A Benchmark for Evaluating Turn-Taking and Overlap Handling in Full-Duplex Spoken Dialogue Models

Python · 94 stars · 4 forks · Updated Sep 21, 2025

XIAOYixuan / IMS-ADD

IMS's Audio Deepfake Detection Toolkit

Python · 5 stars · Updated Sep 18, 2025

DwangoMediaVillage / mlsa_neural_vocoder

Python · 7 stars · 3 forks · Updated Jul 2, 2025

ludlows / PESQ

PESQ (Perceptual Evaluation of Speech Quality) Wrapper for Python Users (narrow band and wide band)

C · 609 stars · 103 forks · Updated Sep 5, 2024

gitgaviny / speechLLMs

SLM for SER

Python · 1 star · Updated Oct 25, 2025

neuphonic / neutts-air

On-device TTS model by Neuphonic

Python · 3,692 stars · 355 forks · Updated Oct 27, 2025

MuSAELab / AUDDT

A toolkit for benchmarking on a wide variety of audio deepfake datasets.

Python · 18 stars · Updated Oct 9, 2025

thewh1teagle / heb-piper-tts-gemma-g2p-onnx

Text to speech with Hebrew G2P and TTS models based on Piper/Gemma3

Python · 1 star · Updated Oct 4, 2025

voicekit-team / T-one

T-one is a high-performance streaming ASR pipeline for Russian, specialized for the telephony domain.

Python · 207 stars · 18 forks · Updated Jul 30, 2025

kyutai-labs / tts_longeval

Python · 16 stars · 1 fork · Updated Oct 1, 2025

angkordotdev / khmercut

Khmer Tokenizer

PHP · 2 stars · Updated May 4, 2025

microsoft / VibeVoice

Frontier Open-Source Text-to-Speech

9,763 stars · 1,234 forks · Updated Sep 5, 2025

HeCheng0625 / Diffusion-Speech-Tokenizer

This repository contains a series of works on diffusion-based speech tokenizers, including the official implementation of the paper: "TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Lan…

Python · 185 stars · 12 forks · Updated Sep 21, 2025