A Gradio web UI for running Large Language Models like LLaMA
A deep learning toolkit for Text-to-Speech, battle-tested in research
Audiocraft is a library for audio processing and generation
Multimodal-Driven Architecture for Customized Video Generation
JUCE is an open-source cross-platform C++ application framework
Transforming Multimodal Content into Captivating Multilingual Audio
State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX
Create music with JavaScript
Examples and guides for using the Gemini API
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Installed on a smartphone that has no connection to any network
Capable of understanding text, audio, vision, video
Multimodal Diffusion with Representation Alignment
State-of-the-art diffusion models for image and audio generation
48 kHz stereo neural audio codec for general audio
High-quality multi-lingual text-to-speech library by MyShell.ai
The official .NET library for the OpenAI API
The official Go library for the OpenAI API
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Swift audio synthesis, processing, & analysis platform
State-of-the-art TTS model under 25MB
Audio playback and capture library written in C
R Package for Music Score and Audio Generation
Adversarial Robustness Toolbox (ART) - Python Library for ML security
Daptin - Backend As A Service - GraphQL/JSON-API Headless CMS