OmniNWM is a unified panoramic navigation world model for autonomous driving simulation. It jointly generates multi-modal states (RGB, semantics, metric depth, and 3D occupancy), enables precise action control via normalized Plücker ray-maps, and supports closed-loop evaluation through occupancy-based dense rewards.
OmniNWM addresses three core dimensions of autonomous driving world models:
- 📊 State: Joint generation of panoramic RGB, semantic, metric depth, and 3D occupancy videos
- 🎮 Action: Precise panoramic camera control via normalized Plücker ray-maps (see the ray-map sketch after this list)
- 🏆 Reward: Integrated occupancy-based dense rewards for driving compliance and safety (a reward sketch follows the feature table)
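
As a point of reference, the sketch below builds a Plücker ray-map for a single pinhole camera from intrinsics `K` and a camera-to-world pose `(R, t)`. The function name and the simple unit-norm direction normalization are illustrative assumptions; OmniNWM's exact normalization scheme is described in the paper.

```python
# Minimal sketch: per-pixel Plücker coordinates (d, m) for one pinhole camera.
# Names and the normalization choice are illustrative, not the repo's API.
import numpy as np

def plucker_ray_map(K: np.ndarray, R: np.ndarray, t: np.ndarray,
                    H: int, W: int) -> np.ndarray:
    """Return an (H, W, 6) map of Plücker coordinates per pixel."""
    # Pixel grid in homogeneous image coordinates (u, v, 1), sampled at centers.
    u, v = np.meshgrid(np.arange(W) + 0.5, np.arange(H) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)          # (H, W, 3)

    # Back-project pixels to camera-frame rays, then rotate into world frame.
    dirs = pix @ np.linalg.inv(K).T                           # camera-frame rays
    dirs = dirs @ R.T                                         # world-frame rays
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)      # unit directions

    # Plücker moment m = o x d, with o = t the camera center in world frame.
    moment = np.cross(np.broadcast_to(t, dirs.shape), dirs)
    return np.concatenate([dirs, moment], axis=-1)            # (H, W, 6)
```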
| Feature | Description |
|---|---|
| Multi-modal Generation | Jointly generates RGB, semantic, depth, and 3D occupancy in panoramic views |
| Precise Camera Control | Normalized Plücker ray-maps for pixel-level trajectory interpretation |
| Long-term Stability | A flexible forcing strategy enables auto-regressive generation beyond the ground-truth (GT) sequence length (see the rollout sketch below) |
| Closed-loop Evaluation | Occupancy-based dense rewards enable realistic driving policy evaluation |
| Zero-shot Generalization | Transfers across datasets and camera configurations without fine-tuning |
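
The long-horizon behavior can be pictured as a plain auto-regressive loop: generate a chunk of frames, append it to the context, and repeat. `model.generate_chunk` and its arguments below are hypothetical placeholders standing in for the model's actual interface, which applies the flexible forcing strategy described above.

```python
# Hypothetical sketch of long-horizon auto-regressive rollout: each new chunk
# is conditioned on the most recent generated frames rather than ground truth,
# so the video can extend beyond the GT sequence length.
def rollout(model, context_frames, actions, chunk_len=8, horizon=64):
    frames = list(context_frames)
    for start in range(0, horizon, chunk_len):
        chunk = model.generate_chunk(
            context=frames[-len(context_frames):],    # condition on recent frames
            actions=actions[start:start + chunk_len]  # Plücker ray-map controls
        )
        frames.extend(chunk)
    return frames
```

For intuition, an occupancy-based dense reward can score an ego pose by how much of the vehicle footprint overlaps occupied voxels. The sketch below is a deliberately simplified assumption; OmniNWM's actual reward also accounts for driving-rule compliance.

```python
# Hedged sketch of an occupancy-based dense reward. `occ` is a binary (X, Y, Z)
# occupancy grid; `footprint_idx` holds the (N, 3) voxel indices covered by the
# ego vehicle at the evaluated pose. Names are illustrative.
import numpy as np

def dense_reward(occ: np.ndarray, footprint_idx: np.ndarray,
                 collision_penalty: float = 1.0) -> float:
    x, y, z = footprint_idx.T
    # Fraction of ego-occupied voxels that collide with the scene.
    collision = occ[x, y, z].mean()
    # Dense signal: 1.0 when fully free, decreasing with overlap.
    return 1.0 - collision_penalty * collision
```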
- [2025/09]: Demo is released on the Project Page.
```bibtex
@article{li2025omninwm,
  title={OmniNWM: Omniscient Driving Navigation World Models},
  author={Li, Bohan and Ma, Zhuang and Du, Dalong and Peng, Baorui and Liang, Zhujin and Liu, Zhenqiang and Ma, Chao and Jin, Yueming and Zhao, Hao and Zeng, Wenjun and others},
  journal={arXiv preprint arXiv:2510.18313},
  year={2025}
}
```

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
🌟 Star us on GitHub if you find this project helpful! 🌟