Stars
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Native Apple Silicon LLM server on MLX with chat UI, menu bar app, and CLI. OpenAI & Ollama compatible. Supports Apple Foundation Models.
SonicMaster: Towards Controllable All-in-One Music Restoration and Mastering
OCR & Document Extraction using vision models
Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs
The easiest way to install and switch between multiple versions of Xcode - with a mouse click.
Real-time face swap and one-click video deepfake with only a single image
Awesome lists about all kinds of interesting topics
CodeGeeX4-ALL-9B, a versatile model for all AI software development scenarios, including code completion, code interpreter, web search, function calling, repository-level Q&A and much more.
Put the output from any script or program into your macOS Menu Bar (the BitBar reboot)
Multilingual Voice Understanding Model
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
V-Express aims to generate a talking head video under the control of a reference image, an audio, and a sequence of V-Kps images.
A Hex Editor for Reverse Engineers, Programmers and people who value their retinas when working at 3 AM.
The best real-time interactive AI avatar (digital human) with on-premise deployment and <1.5 s latency.
GeneFace++: Generalized and Stable Real-Time 3D Talking Face Generation; Official Code
Open source real-time translation app for Android that runs locally
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
StreamSpeech is an "All in One" seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
A roadmap for "generative AI" learning resources
21 Lessons, Get Started Building with Generative AI