- UT Austin
- TX, USA
- https://www.linkedin.com/in/xinyu-gong-b4ab73191
Stars
A curated list of recent diffusion models for video generation, editing, and various other applications.
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
An open-source implementation for fine-tuning the Qwen-VL series by Alibaba Cloud.
Official GitHub repository for FLUX.1 Krea [dev].
DiffusionNFT: Online Diffusion Reinforcement with Forward Process
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
[Preprint] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification.
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Enjoy the magic of Diffusion models!
The collection of awesome papers on alignment of diffusion models.
[CVPR 2025] HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
A unified inference and post-training framework for accelerated video generation.
My learning notes/codes for ML SYS.
Official implementation of the paper "Transfer between Modalities with MetaQueries"
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation [SIGGRAPH Asia 2025]
APOLLO: SGD-like Memory, AdamW-level Performance; MLSys'25 Outstanding Paper Honorable Mention
[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention
MAGI-1: Autoregressive Video Generation at Scale
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
Let's make video diffusion practical!
[NeurIPS 2025] Improving Video Generation with Human Feedback
SkyReels-A2: Compose anything in video diffusion transformers