Cosmos-Predict2 is a collection of general-purpose world foundation models for Physical AI that can be fine-tuned into customized world models for downstream applications.

Python 652 88 Updated Oct 29, 2025

xiaomi-mlab / Orion

[ICCV 2025] Official code of "ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation"

Python 475 44 Updated Oct 9, 2025

hiyouga / LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 61,257 7,414 Updated Oct 30, 2025

worldbench / lidarcrafter

LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences

Python 130 11 Updated Oct 13, 2025

datawhalechina / self-llm

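"Open-Source Large Model Usage Guide": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large language models (MLLMs) in a Linux environment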

Jupyter Notebook 25,560 2,572 Updated Oct 30, 2025

IGL-HKUST / DiffusionAsShader

[SIGGRAPH 2025] Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control

Python 763 34 Updated Jun 9, 2025

Wan-Video / Wan2.2

Wan: Open and Advanced Large-Scale Video Generative Models

Python 10,956 1,203 Updated Oct 12, 2025

worldbench / pi3det

[ICCV 2025] Perspective-Invariant 3D Object Detection

138 11 Updated Jul 24, 2025

valeoai / LOGen

Official Repository of "LOGen: Towards LiDAR Object Generation by Point Diffusion"

Python 10 1 Updated Jun 27, 2025

OpenDriveLab / DetAny3D

[ICCV 2025] Detect Anything 3D in the Wild

Python 215 11 Updated Jul 8, 2025

worldbench / evalkit

🌐 A curated evaluation toolkit and benchmark for state-of-the-art 3D and 4D world models

2 Updated Jun 21, 2025

worldbench / survey

🌐 3D and 4D World Modeling: A Survey

HTML 612 35 Updated Oct 3, 2025

ahydchh / Impromptu-VLA

Python 336 20 Updated Oct 29, 2025

robosense2025 / track4

Track 4: Cross-Modal Drone Navigation

Python 17 3 Updated Aug 28, 2025

robosense2025 / track3

Track 3: Sensor Placement

Python 19 Updated Aug 22, 2025

robosense2025 / track2

Track 2: Social Navigation

21 Updated Aug 19, 2025

robosense2025 / track1

Track 1: Driving with Language

Python 23 Updated Aug 23, 2025

ucla-mobility / AutoVLA

[NeurIPS 2025] AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning

278 8 Updated Sep 19, 2025

AlanLiang AlanLiangC

Lists (15)

3d gaussians

Automatic labeling tool

Autonomous Driving

Bro-Donggongong

ChatGPT

Computer Graphics

Computer Vision

Diffusion

Mamba

NeRF

Papers

PointCloud

Robots

SEE4D

Tools
