Stars
ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
A lightweight network for body/hand action recognition
PaddleOCR inference in PyTorch. Converted from [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
Best-practice TTS based on BERT and VITS, with some NaturalSpeech features from Microsoft; supports ONNX streaming output.
A fully working PyTorch implementation of NaturalSpeech (Tan et al., 2022)
Production First and Production Ready End-to-End Speech Recognition Toolkit
A Trimap-Free Portrait Matting Solution in Real Time [AAAI 2022]