wjkang-furiosa/diffusion-study

Resources for Diffusion Models

Consistency Models

Consistency Models ICML 2023 #CM
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference Preprint #LCM
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module Preprint
Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion ICLR 2024 #CTM
Phased Consistency Model Preprint #PCM
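The papers above share one core idea: a consistency function f(x, t) that maps any point on a diffusion trajectory directly to a clean-sample estimate, which enables one- or few-step sampling. A toy NumPy sketch of the multistep sampling loop from the Consistency Models paper (the stand-in model `toy_f`, the timestep list, and `sigma_min` are illustrative placeholders, not any paper's trained model or exact schedule):

```python
import numpy as np

def multistep_consistency_sampling(f, shape, timesteps, sigma_min=0.002, rng=None):
    """Few-step sampling with a consistency function f(x, t) -> clean estimate.

    `f` is assumed to map a noisy sample at noise level t directly to an
    estimate of the clean sample (the defining property of a consistency
    model); `timesteps` is a decreasing list of noise levels, largest first.
    """
    rng = np.random.default_rng() if rng is None else rng
    t_max = timesteps[0]
    x = rng.standard_normal(shape) * t_max        # start from pure noise at t_max
    x0 = f(x, t_max)                              # one-step estimate of the clean sample
    for t in timesteps[1:]:                       # optional refinement steps
        z = rng.standard_normal(shape)
        x = x0 + np.sqrt(t**2 - sigma_min**2) * z # re-noise the estimate down to level t
        x0 = f(x, t)                              # denoise again in a single call
    return x0

# Toy stand-in for a trained consistency model: shrink toward zero as t grows.
toy_f = lambda x, t: x / (1.0 + t)
sample = multistep_consistency_sampling(toy_f, (4,), [80.0, 10.0, 1.0],
                                        rng=np.random.default_rng(0))
```

With a single entry in `timesteps` this degenerates to one-step generation; each extra entry trades one more network call for quality.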

Text-to-Image

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs ICML 2024 #RPG

Stability AI

SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis ICLR 2024 #SDXL
Adversarial Diffusion Distillation Preprint #SDXL-Turbo
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis ICML 2024 #SD3
Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation Preprint #SD3-Turbo
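The SD3 paper trains its denoiser as a velocity regressor along straight data-to-noise paths (rectified flow / flow matching). A minimal sketch of how one such training pair is formed (the function name and the uniform sampling of t are illustrative; SD3 uses a tuned timestep weighting):

```python
import numpy as np

def rectified_flow_pair(x0, rng):
    """Build one training pair for a rectified-flow objective.

    The model is trained to predict the constant velocity (x1 - x0) along the
    straight line between a data sample x0 and a noise sample x1, evaluated
    at a randomly chosen time t.
    """
    x1 = rng.standard_normal(x0.shape)  # noise endpoint of the path
    t = rng.uniform()                   # random time in [0, 1]
    xt = (1.0 - t) * x0 + t * x1        # linear interpolation between data and noise
    v_target = x1 - x0                  # straight-line velocity the model regresses to
    return xt, t, v_target

x0 = np.zeros(3)
xt, t, v = rectified_flow_pair(x0, np.random.default_rng(0))
```

By construction `xt == x0 + t * v`, which is what makes generation a simple ODE integration along near-straight paths.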

Huawei

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis ICLR 2024
PixArt-δ: Fast and Controllable Image Generation with Latent Consistency Models Preprint
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation Preprint

Text-to-Video / Image-to-Video

Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets Preprint #SVD
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation Preprint
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models Preprint
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors Preprint
Latte: Latent Diffusion Transformer for Video Generation Preprint
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning ICLR 2024
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation Preprint
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text Preprint
Open-Sora-Plan
Open-Sora
MiniSora
VGen

Fast Inference

DeepCache: Accelerating Diffusion Models for Free CVPR 2024
FreeU: Free Lunch in Diffusion U-Net CVPR 2024
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models CVPR 2024
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation Preprint
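DeepCache, listed above, exploits the observation that a U-Net's deep features change little between adjacent denoising steps, so they can be cached and reused while only the shallow layers are recomputed. A hypothetical sketch of that caching pattern (the shallow/deep/combine split and all names are illustrative, not the paper's actual interface):

```python
class CachedDenoiser:
    """Recompute cheap shallow layers every step; refresh expensive deep
    layers only every `cache_interval` steps and reuse the cached result
    in between (a DeepCache-style step-skipping pattern)."""

    def __init__(self, shallow, deep, combine, cache_interval=3):
        self.shallow = shallow            # cheap down-blocks, run every step
        self.deep = deep                  # expensive mid/up-blocks, run every N steps
        self.combine = combine            # skip-connection merge of the two paths
        self.cache_interval = cache_interval
        self._cached_deep = None

    def __call__(self, x, step):
        h = self.shallow(x)
        if step % self.cache_interval == 0 or self._cached_deep is None:
            self._cached_deep = self.deep(h)       # refresh the deep features
        return self.combine(h, self._cached_deep)  # otherwise reuse the cache

# Toy demo: `deep` records how often it actually runs.
deep_calls = []
def deep(h):
    deep_calls.append(1)
    return h * 2.0

denoiser = CachedDenoiser(shallow=lambda x: x + 1.0, deep=deep,
                          combine=lambda h, d: h + d, cache_interval=3)
outs = [denoiser(0.0, step) for step in range(6)]  # deep runs only at steps 0 and 3
```

With `cache_interval=3`, the expensive branch runs on 2 of 6 steps here; the real method applies the same idea inside the U-Net's skip connections.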

Digital Human

Animation

MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model CVPR 2024
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance Preprint
Follow-Your-Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos AAAI 2024
MuseV
MusePose

Virtual Try-on

OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on Preprint
Magic Clothing: Controllable Garment-Driven Image Synthesis Preprint
IDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild Preprint

Talking Head

DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models Preprint
V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation Preprint
MuseTalk

Autoregressive Models

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction Preprint #VAR
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation Preprint

Personalized

IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models Preprint
AnyDoor: Zero-shot Object-level Image Customization Preprint
Style Aligned Image Generation via Shared Attention CVPR 2024

Video Editing

TokenFlow: Consistent Diffusion Features for Consistent Video Editing ICLR 2024

Others

Text-to-3D / Image-to-3D

SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion Preprint #SV3D
Wonder3D: Single Image to 3D using Cross-Domain Diffusion CVPR 2024
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior ICLR 2024
Shap-E: Generating Conditional 3D Implicit Functions Preprint
Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model Preprint
Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors ICLR 2024
