Starred repositories
The official code of "Image is All You Need to Empower Large-scale Diffusion Models for In-Domain Generation". [CVPR2025]
A unified framework for easy reinforcement learning in Flow-Matching models
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
Better Aligning Text-to-Image Models with Human Preference. ICCV 2023
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image Generation.
Official implementation of HPSv3: Towards Wide-Spectrum Human Preference Score (ICCV2025)
Data and sample evaluation code for Multimodal Rewardbench 2
Pixel-Level Reasoning Model trained with RL [NeurIPS25]
Unlocking Iterative Reasoning for Any Image Editor
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
The first interleaved framework for textual reasoning within the visual generation process
A survey for visual generation alignment
Paper List of Inference/Test Time Scaling/Computing
[ICLR 2026 Oral] DiffusionNFT: Online Diffusion Reinforcement with Forward Process
Edit-R1: Reinforce Image Editing with Diffusion Negative-Aware Finetuning and MLLM Implicit Feedback
[ICLR 26] TempFlow-GRPO (Temporal Flow GRPO), a principled GRPO framework that captures and exploits the temporal structure inherent in flow-based generation.
OmniGen2: Exploration to Advanced Multimodal Generation. https://arxiv.org/abs/2506.18871
EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling
Official repo for paper "EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning"