Stars
MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models
MAGI-1: Autoregressive Video Generation at Scale
Repository for the paper https://arxiv.org/abs/2504.13837
Official repository of Uni-AdaFocus (TPAMI 2024).
[ECCV 2024] Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
[NeurIPS 2024] ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
[ICML 2024] SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning
[TPAMI 2024] Probabilistic Contrastive Learning for Long-Tailed Visual Recognition
[ICLR 2025] Accelerating Diffusion Transformers with Token-wise Feature Caching
[ICLR 2025] Official Implementation of Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
[ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis"
Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment, arXiv 2024 / CVPR 2025
This is the first chat model fine-tuned specifically for Chinese through ORPO, based on the Meta-Llama-3-8B-Instruct model.
LLaVA-UHD v2: an MLLM Integrating High-Resolution Semantic Pyramid via Hierarchical Window Transformer
Official repository of Agent Attention (ECCV 2024)
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models, arXiv 2023 / CVPR 2024
Repository of "Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online Reinforcement Learning" (NeurIPS 2023 Spotlight)
Official code for the paper "Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL"