Stars
Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction"
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
Code for [CVPR 2025] ROICtrl: Boosting Instance Control for Visual Generation
[NeurIPS 2024] EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.
High-resolution models for human tasks.
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
(ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator
CosmicMan: A Text-to-Image Foundation Model for Humans (CVPR 2024)
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
Open-Sora: Democratizing Efficient Video Production for All
AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text
[CVPR 2024] Official repository for "MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model"
[NeurIPS 2023] XAGen: 3D Expressive Human Avatars Generation
[ICCV 2023] GETAvatar: Generative Textured Meshes for Animatable Human Avatars
[IJCV] Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
FreeU: Free Lunch in Diffusion U-Net (CVPR 2024 Oral)
MagicEdit: High-Fidelity Temporally Coherent Video Editing
MagicAvatar: Multimodal Avatar Generation and Animation
[CVPR 2024] ViT-Lens: Towards Omni-modal Representations
[NeurIPS 2023] DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models
[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding
[CVPR 2024, Highlight] Official code for DragDiffusion
The repository for the paper "Unsupervised Volumetric Animation"
[ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
[Image 2 Text Para] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.