Stars
rCM: SOTA Diffusion Distillation & Few-Step Video Generation
Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"
[ICLR 2025] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
PyTorch implementation of MeanFlow on ImageNet and CIFAR10
A mini-library for training consistency models.
GoatWu / Self-Forcing-Plus
Forked from guandeh17/Self-Forcing. Unofficial extension of Self-Forcing that adds support for I2V and 14B training.
Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
A pipeline parallel training script for diffusion models.
Elucidating the Design Space of Diffusion-Based Generative Models (EDM)
[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)
The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
[CVPR 2025 Oral] Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
Benchmarking physical understanding in generative video models
[CVPR 2025 Oral] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
[ECCV 2024, Oral] FMBoost: Boosting Latent Diffusion with Flow Matching
[CVPR2025] PyTorch-based reimplementation of CrossFlow, as proposed in 'Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution'
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
✨✨Latest Advances on Multimodal Large Language Models