A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 16,043 3,177 Updated Nov 5, 2025

mkunes / w2v2_audioFrameClassification

wav2vec2 audio classification for prosodic boundary detection and other tasks

Jupyter Notebook 42 6 Updated Aug 11, 2023

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 152,117 31,045 Updated Nov 5, 2025

HarryVolek / PyTorch_Speaker_Verification

PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.

Python 593 164 Updated Jan 20, 2022

ranchlai / speaker-verification

Speaker verification using ResnetSE (EER=0.0093) and ECAPA-TDNN

Python 96 22 Updated Sep 15, 2021

facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 31,918 6,622 Updated Sep 30, 2025

hitachi-speech / EEND

End-to-End Neural Diarization

Python 409 63 Updated Aug 30, 2021

kaldi-asr / kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell 15,205 5,368 Updated Sep 22, 2025

s3prl / s3prl

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Python 2,470 516 Updated Jun 13, 2025

deepinsight / insightface

State-of-the-art 2D and 3D Face Analysis Project

Python 26,946 5,814 Updated Sep 27, 2025

Zbyněk Zajíc zzajic

Stars

BUTSpeechFIT / DiariZen

deepseek-ai / DeepSeek-V3

AI-Republic-PH / AIR_AI_Engineering_Course

wenet-e2e / wespeaker

pyannote / pyannote-metrics

pyannote / pyannote-audio

desh2608 / dover-lap

speechbrain / speechbrain

NVIDIA-NeMo / NeMo