DDN: A novel generative model with simple principles and unique properties. (ICLR 2025)

Python 162 8 Updated Aug 18, 2025

Speech Human Evaluation Estimation Toolkit (SHEET)

Python 120 10 Updated Oct 2, 2025

Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.

Jupyter Notebook 2,737 152 Updated Oct 9, 2025

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 7,160 649 Updated Oct 23, 2025

eSpeak NG is an open-source speech synthesizer that supports more than a hundred languages and accents.

C 1 Updated Jul 12, 2025

✨ Split text by languages (e.g. 你喜欢看アニメ吗 -> 你喜欢看 | アニメ | 吗) for NLP tasks (e.g. parse, TTS). Powered by fasttext and budoux

Jupyter Notebook 65 8 Updated Sep 18, 2025

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.

Python 4,936 373 Updated Apr 21, 2025

Ultimate Vocal Remover Inference CLI

Python 92 10 Updated Feb 5, 2025

[SIGGRAPH Asia 2024, Journal Track] ToonCrafter: Generative Cartoon Interpolation

Python 5,914 520 Updated Mar 19, 2025

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 11,829 893 Updated Sep 30, 2025

XiaoGua RTC Firmware is firmware for the ESP32 chip, designed by zideai.com.

C 19 5 Updated Mar 14, 2025

Modern audio compression for the internet.

C 2,796 707 Updated Oct 22, 2025

An MCP-based chatbot

C++ 20,650 4,158 Updated Oct 24, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 27,451 2,523 Updated Oct 25, 2025

[ICCV 2025] SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer

Python 301 7 Updated Dec 29, 2024
Python 786 73 Updated Jun 7, 2024

A lightweight muji-moe chatbot created by Reecho.ai.

C++ 12 Updated Oct 1, 2024

Acoustic Echo Canceller for Mobile Module Port From WebRTC

C 205 96 Updated Apr 28, 2025

AEC3 Extracted From WebRTC

C++ 189 90 Updated Feb 24, 2022

AI-based Audio Watermarking Tool

Python 286 39 Updated Jan 7, 2024

WebRTC Library for IoT/Embedded Device using C

C 1,336 228 Updated Sep 29, 2025

Zero-shot voice conversion and singing voice conversion, with real-time support

Python 3,339 390 Updated Apr 20, 2025

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 9,027 816 Updated Oct 15, 2025

STM32 extension for working with STM32 and CubeMX in VSCode

C 248 31 Updated Sep 2, 2025

Styled banners for your README, made with HTML/CSS in SVG!

JavaScript 371 50 Updated Oct 13, 2020

AI powered speech denoising and enhancement

Python 2,017 241 Updated Dec 3, 2024

The pkuseg toolkit for multi-domain Chinese word segmentation

Python 6,660 983 Updated Nov 5, 2022

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 60,977 10,765 Updated Oct 25, 2025

Implementation for MatMul-free LM.

Python 3,032 196 Updated Jul 21, 2025