Stars
DC-Gen: Post-Training Diffusion Acceleration with Deeply Compressed Latent Space
Official PyTorch Implementation of "Latent Diffusion Model Without Variational Autoencoder".
Stand-In is a lightweight, plug-and-play framework for identity-preserving video generation.
[CVPR 2025] FaithDiff for Classic Film Rejuvenation, Old Photo Revival, Social Media Restoration, Image Enhancement and AIGC Enhancement.
[Arxiv'25] IC-Custom: Diverse Image Customization via In-Context Learning
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Qwen-Image text-to-image LoRA trainer
Code for CineScale, higher-resolution video generation based on Wan
Awesome curated collection of images and prompts generated by GPT-4o and gpt-image-1. Explore AI generated visuals created with ChatGPT and Sora, showcasing OpenAI’s advanced image generation capab…
A curated collection of fun and creative examples generated with Nano Banana 🍌, a Gemini-2.5-flash-image-based model. We also release Nano-consistent-150K openly to support the community's development…
Scaling Diffusion Transformers with Mixture of Experts
Adapt an LLM into a Mixture-of-Experts model using parameter-efficient fine-tuning (LoRA), injecting the LoRAs into the FFN.
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
[CVPR 2025] Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Cosmos-Transfer1-DiffusionRenderer: High-quality video de-lighting and re-lighting based on Cosmos video diffusion framework
Python tool for converting files and office documents to Markdown.
Unofficial PyTorch implementation of the paper "Mean Flows for One-step Generative Modeling" by Geng et al.
[NeurIPS'25 Spotlight] Official repository for "Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment"
A SOTA open-source image editing model that aims to provide performance comparable to closed-source models like GPT-4o and Gemini 2 Flash.
📚 A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc. 🎉
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation