Stars
[NeurIPS 2025] Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
A curated list of recent diffusion models for video generation, editing, and various other applications.
The official rendering library for PAG (Portable Animated Graphics) files that renders After Effects animations natively across multiple platforms.
A Python module to repair invalid JSON from LLMs
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
A generative world for general-purpose robotics & embodied AI learning.
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
A collection of ECCV 2024 papers and open-source projects. Everyone is welcome to submit issues sharing ECCV 2024 papers and open-source projects.
Just 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.
✨✨Latest Advances on Multimodal Large Language Models
An intuitive GUI for GLIGEN that uses ComfyUI in the backend
An awesome list of layout generation papers
Segment Anything in High Quality [NeurIPS 2023]
[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Open-Sora: Democratizing Efficient Video Production for All
Development repository for the Triton language and compiler
Official implementation of AnimateDiff.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Python wrapper for Philipp Krähenbühl's dense (fully connected) CRFs with Gaussian edge potentials.
OPTango: Multi-central Representation Learning against Innumerable Compiler Optimization for Binary Diffing
Laf is a vibrant cloud development platform that provides essential tools like cloud functions, databases, and storage solutions. It enables developers to quickly unleash their creativity and bring…
ChatLaw: A Powerful LLM Tailored for the Chinese Legal Domain. A Chinese legal large language model.
The official gpt4free repository | a collection of various powerful language models | o4, o3 and deepseek r1, gpt-4.1, gemini 2.5
GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.
A curated list of awesome neural radiance fields papers