🌐 Website • 📄 Paper • 🛠️ Installation • 📊 Baselines • 🧾 Citation
RoboHiMan is a hierarchical evaluation paradigm designed to study compositional generalization in long-horizon manipulation.
It introduces HiMan-Bench, a benchmark consisting of atomic and compositional tasks under diverse perturbations, supported by a multi-level training dataset for analyzing progressive data scaling.
RoboHiMan evaluates policies hierarchically. This repository mainly provides three settings:
- Vanilla: the learned policy alone, with no external planner.
- Policy + Rule-based planner: a learned policy paired with a traditional rule-based planner for subgoal sequencing.
- Policy + VLM-based planner: a learned policy guided by a vision-language-model (VLM) planner for high-level planning and subgoal generation.
These settings probe the necessity of skill composition and reveal bottlenecks in current hierarchical architectures.
Experiments highlight clear capability gaps across representative models, pointing to future directions for advancing real-world long-horizon manipulation systems.
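
To make the difference between the three settings concrete, here is a minimal Python sketch. It is an illustration only and not the repository's API: every name in it (`run_episode`, `make_rule_based_planner`, `make_vlm_planner`, `toy_policy`, the task strings) is hypothetical, the policy is a toy stand-in that succeeds only on a fixed set of atomic skills, and the VLM is stubbed as a plain text function.

```python
from typing import Callable, Dict, List, Optional

# All names here are hypothetical; none are taken from the RoboHiMan codebase.
Policy = Callable[[str], bool]        # executes one instruction, returns success
Planner = Callable[[str], List[str]]  # maps a task to an ordered list of subgoals


def make_rule_based_planner(rules: Dict[str, List[str]]) -> Planner:
    """Planner that looks up a hand-written subgoal sequence for each task."""
    def plan(task: str) -> List[str]:
        return list(rules[task])
    return plan


def make_vlm_planner(query_vlm: Callable[[str], str]) -> Planner:
    """Planner that asks a VLM (stubbed as a text function) for one subgoal per line."""
    def plan(task: str) -> List[str]:
        reply = query_vlm(task)
        return [line.strip() for line in reply.splitlines() if line.strip()]
    return plan


def run_episode(task: str, policy: Policy, planner: Optional[Planner] = None) -> bool:
    """Vanilla setting when planner is None; otherwise run the planned subgoals in order."""
    instructions = [task] if planner is None else planner(task)
    # An empty plan counts as failure; all() stops at the first failed subgoal.
    return bool(instructions) and all(policy(step) for step in instructions)


# Toy stand-in for a learned policy: it only succeeds on known atomic skills.
ATOMIC_SKILLS = {"open drawer", "pick block", "place block in drawer"}


def toy_policy(instruction: str) -> bool:
    return instruction in ATOMIC_SKILLS


TASK = "put block in drawer"
SUBGOALS = ["open drawer", "pick block", "place block in drawer"]

rule_planner = make_rule_based_planner({TASK: SUBGOALS})
vlm_planner = make_vlm_planner(lambda task: "\n".join(SUBGOALS))  # stubbed VLM reply

results = {
    "vanilla": run_episode(TASK, toy_policy),
    "rule_based": run_episode(TASK, toy_policy, rule_planner),
    "vlm_based": run_episode(TASK, toy_policy, vlm_planner),
}
print(results)
```

In this toy setup the vanilla run fails because the policy never sees the compositional instruction broken into atomic skills, while both planner-backed runs succeed. This only illustrates the structure of the three settings and says nothing about the actual benchmark results.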
See INSTALL.md for detailed setup instructions.
See HIGH_LEVEL.md for instructions on the high-level planner settings.
If you find this work useful, please consider citing:
@article{chen2025robohiman,
title={RoboHiMan: A hierarchical evaluation paradigm for compositional generalization in long-horizon manipulation},
author={Chen, Yangtao and Chen, Zixuan and Chan, Nga Teng and Chen, Junting and Yin, Junhui and Shi, Jieqi and Gao, Yang and Li, Yong-Lu and Huo, Jing},
journal={arXiv preprint arXiv:2510.13149},
year={2025}
}

This project builds upon several excellent open-source efforts. We sincerely thank the authors of the following projects for their valuable contributions to the community:
