Stars
Official Repo For "BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration"
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse–Linear Attention
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Scaling Diffusion Transformers with Mixture of Experts
Krea Realtime 14B. An open-source realtime AI video model.
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
A pipeline parallel training script for diffusion models.
🔥 A minimal training framework for scaling FLA models
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
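AIInfra (AI infrastructure) refers to the AI system stack, from underlying hardware such as chips up to the upper-level software stack, that supports training and inference of large AI models.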
verl: Volcano Engine Reinforcement Learning for LLMs
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
📹 A more flexible framework that can generate videos at any resolution and create videos from images.
You like pytorch? You like micrograd? You love tinygrad! ❤️
Video-R1: Reinforcing Video Reasoning in MLLMs [🔥the first paper to explore R1 for video]
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
The official implementation of Diffusion-KTO: Aligning Diffusion Models by Optimizing Human Utility
Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction"
Fully open reproduction of DeepSeek-R1
[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing
No fortress, purely open ground. OpenManus is Coming.
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Collect every awesome work about r1!
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'
Wan: Open and Advanced Large-Scale Video Generative Models
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
[TMLR 2025🔥] A survey for the autoregressive models in vision.