Stars
MotionAgent is your AI assistant to convert ideas into motion pictures.
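This project simulates the end-to-end process of building and querying a private knowledge base of health records. A single codebase implements RAG (retrieval-augmented generation) with support for multiple large language models (such as OpenAI and Alibaba's Tongyi Qianwen). (1) Offline steps: load documents -> split documents -> embed -> load into the vector database; (2) online steps: receive the user question -> embed the user question -> search the vector database -> fill the retrieved results and the user question into the prompt template -> call the LLM with the final prompt -> the LLM generates the reply.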
A pipeline parallel training script for diffusion models.
YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open
The official code repository for LeVo: High-Quality Song Generation with Multi-Preference Alignment
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
Image inpainting tool powered by SOTA AI models. Remove any unwanted objects, defects, or people from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting zero-shot multi-speaker voice cloning, and long-form speech…
StyleGAN-Human: A Data-Centric Odyssey of Human Generation
[ICCV 2025] VisualCloze: A universal image generation framework that can support a wide range of in-domain tasks and generalize to unseen ones. (🔥 🔥 🔥 Merged into official pipelines of diffusers.)
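A multi-speaker voice cloning project based on the Cosyvoice2-0.5B model, built with flet. It supports multi-voice management, history management, and one-click cloning, and can generate speech quickly from just a few seconds of voice audio.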
[ICCV 2025] 🔥🔥 UNO: A Universal Customization Method for Both Single and Multi-Subject Conditioning
SoftVC VITS Singing Voice Conversion
Text Normalization & Inverse Text Normalization
A machine learning image inpainting task that instinctively removes watermarks from images, producing output indistinguishable from the ground-truth image.
[ECCV2024] IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild
[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It enables customizable human image generation with flexible garment, pose, and scene control, ensuring high …
🪞 Instant AI Face Swap: one-click AI face swapping to discover a more beautiful you
Officially recommended ChatTTS resource collection project, gathering related resources from across the web and frequently asked questions
Automatically remove the mosaics in images and videos, or add mosaics to them.
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
OpenMMLab Pre-training Toolbox and Benchmark
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …
ECCV2020 paper "Whole-Body Human Pose Estimation in the Wild"
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, No-code agent builder, MCP compatibility, and more.
LLM Frontend for Power Users.