bigchou

11 followers · 23 following

@ntu_aiailab
Room 542, CSIE Building, National Taiwan University No. 1, Sec. 4, Roosevelt Road, Da’an Dist.

Stars

spatialaudio / python-sounddevice

🔉 Play and Record Sound with Python 🐍

Python 1,211 154 Updated Dec 15, 2025

abetlen / llama-cpp-python

Python bindings for llama.cpp

Python 9,907 1,275 Updated Aug 15, 2025

NiuTrans / LMT

Building an inclusive, scalable, and high-performance multilingual translation model

Python 119 9 Updated Jan 10, 2026

FunAudioLLM / Fun-ASR

Fun-ASR is an end-to-end speech recognition large model launched by Tongyi Lab.

Python 736 55 Updated Jan 14, 2026

XMUSPEECH-Minspeech / Minspeech

A Corpus of Southern Min Dialect for Automatic Speech Recognition

Python 7 1 Updated Aug 27, 2024

Deep-Learning-101 / Speech-Processing-Paper

https://deep-learning-101.github.io/Speech-Processing Speech Processing (語音處理)

22 2 Updated Jan 4, 2026

defnngj / playwright-pro

playwright project sample

HTML 37 14 Updated Sep 19, 2022

k2-fsa / sherpa-ncnn

Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, L…

C++ 1,611 203 Updated Oct 20, 2025

ouhammmourachid / mermaid-py

Python Interface for the Popular mermaid-js Library, Simplified for Diagram Creation

Python 157 13 Updated Jan 13, 2026

sandy1990418 / ChineseTaiwaneseWhisper

This repository focuses on leveraging OpenAI's Whisper model for speech recognition in Chinese (Mandarin) and Taiwanese Hokkien languages. It includes tools and scripts for data preprocessing, mode…

Python 65 11 Updated Mar 1, 2025

AmirTahaMim / RealTime-QR-Detection

A Python script that detects and decodes QR codes in real time from a live webcam feed. It is a handy tool for instant QR code scanning applications, such as inventory management and digital ticketing.

Python 6 1 Updated Aug 6, 2023

breezedeus / CnOCR

Forked from diaomin/crnn-mxnet-chinese-text-recognition

CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. …

Python 3,722 535 Updated Sep 21, 2025

vtraag / leidenalg

Implementation of the Leiden algorithm for various quality functions to be used with igraph in Python.

Python 722 88 Updated Jan 12, 2026

nari-labs / dia

A TTS model capable of generating ultra-realistic dialogue in one pass.

Python 19,044 1,665 Updated Nov 19, 2025

JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

Python 28,781 3,529 Updated Dec 5, 2025

RapidAI / RapidOCR

📄 Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.

Python 5,668 549 Updated Jan 12, 2026

dusty-nv / jetson-containers

Machine Learning Containers for NVIDIA Jetson and JetPack-L4T

Jupyter Notebook 4,160 774 Updated Jan 12, 2026

THU-MIG / yoloe

YOLOE: Real-Time Seeing Anything [ICCV 2025]

Python 1,998 188 Updated Jun 26, 2025

datawhalechina / self-llm

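"A Guide to Consuming Open-Source Large Models": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying open-source large language models (LLMs) and multimodal large models (MLLMs), from China and abroad, in a Linux environment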

Jupyter Notebook 27,559 2,753 Updated Jan 14, 2026

wzpan / wukong-robot

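🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project. It supports multi-turn ChatGPT conversations and may be the first open-source smart speaker project to support brain-computer interaction.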

Python 7,078 1,410 Updated Oct 25, 2024

PINTO0309 / PINTO_model_zoo

A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8…

Python 4,045 630 Updated Dec 10, 2025

hacksider / Deep-Live-Cam

real time face swap and one-click video deepfake with only a single image

Python 78,260 11,413 Updated Dec 15, 2025

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Python 19,696 2,110 Updated Oct 21, 2025

juanmc2005 / diart

A Python package to build AI-powered real-time audio applications

Python 1,911 154 Updated Feb 12, 2025

egruttadauria98 / SSpaVAlDo

Jupyter Notebook 35 2 Updated Jan 6, 2026

juanmc2005 / rttm-viewer

Application for viewing Rich Transcription Time Marked (RTTM) files in an interactive way

Python 48 5 Updated Apr 19, 2023

KoljaB / RealtimeSTT

A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.

Python 9,337 793 Updated Jul 11, 2025

collabora / WhisperLive

A nearly-live implementation of OpenAI's Whisper.

Python 3,743 512 Updated Jan 13, 2026

huggingface / transformers

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 155,358 31,778 Updated Jan 18, 2026

wq2012 / awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

1,835 236 Updated Jul 22, 2025
