Stars
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Robust Speech Recognition via Large-Scale Weak Supervision
This repo is meant to serve as a detailed guide for Machine Learning/AI interviews.
My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. Currently included IWSLT p…
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
YourChatGPT is a versatile AI chatbot solution that harnesses the capabilities of the ChatGPT API. Crafted to simplify your journey, it enables you to create a tailored ChatGPT clone effortlessly.
Implement MLP from Scratch using Python
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
Unofficial implementation of the paper "SpeakerGAN: Speaker identification with conditional generative adversarial network"
AIDman / TFGAN-PLC
Forked from Guanyuansheng/TFGAN-PLC
A Temporal-Spectral Generative Adversarial Network based End-to-end Packet Loss Concealment for Wideband Speech Transmission
Research code for the paper "Training speaker recognition systems with limited data" at https://arxiv.org/abs/2203.14688
Baseline for the Spoofing-aware Speaker Verification Challenge 2022
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
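Deep Learning 500 Questions: explains frequently asked topics in probability, linear algebra, machine learning, deep learning, computer vision, and related areas in question-and-answer form, to help the author and any readers who need it. The book has 15 chapters and nearly 200,000 characters. The author's knowledge is limited, so readers are kindly asked to point out any errors. To be continued... For collaboration, contact [email protected]. All rights reserved; infringement will be pursued. Tan 2018.06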
PyTorch implementation of Densely Connected Time Delay Neural Network
Deezer source separation library including pretrained models.
Code for the paper: "Leveraging speaker attribute information using multi task learning for speaker verification and diarization" presented at Interspeech 2021
Deep Speaker: an End-to-End Neural Speaker Embedding System.
A library for high performance deep learning inference on NVIDIA GPUs.
Learn an L3 embedding from audio/video pairs
Audio fingerprinting and recognition in Python
Python functions for reading Kaldi data formats. Useful for rapid prototyping with Python.
AIDman / Mistral-Speaker-Recognition-Tutorial
Forked from imanel/Mistral-Speaker-Recognition-Tutorial
Experimenting with Speaker Verification and Recognition using Mistral, a.k.a. Alize