Stars
The official repo for "OpenMoE 2: Sparse Diffusion Language Models".
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
A family of open-sourced Mixture-of-Experts (MoE) Large Language Models
Train transformer language models with reinforcement learning.
[NeurIPS 2025] SURDS: Benchmarking Spatial Understanding and Reasoning in Driving Scenarios with Vision Language Models
Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"
A very simple GRPO implementation for reproducing r1-like LLM thinking.
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
[NeurIPS 2025] AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning
Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.
[NeurIPS 2025] MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
A curated list of awesome LLM/VLM/VLA for Autonomous Driving (LLM4AD) resources (continually updated)
✨✨Latest Advances on Multimodal Large Language Models
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
OpenDriveVLA: Towards End-to-end Autonomous Driving with Large Vision Language Action Model
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
[NeurIPS 2024] Simple and Effective Masked Diffusion Language Model
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Official PyTorch implementation for "Large Language Diffusion Models"
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Official implementation of the paper "Instructions are all you need: Self-supervised Reinforcement Learning for Instruction Following"
[ICCV 2025] Official implementation of the paper: REPA-E: Unlocking VAE for End-to-End Tuning of Latent Diffusion Transformers
The author's implementation of FUDOKI, a multimodal large language model purely based on discrete flow matching.
This repository contains the official implementation of "FlowIE: Efficient Image Enhancement via Rectified Flow"
Official repository for CVPR2025 paper "Reversing Flow for Image Restoration".
An ML research template with good documentation by Boyuan Chen, an MIT PhD student