I am a first-year PhD student in Robotics at the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), advised by Prof. Xingxing Zuo. Before that, I obtained my master's degree in Informatics at the Technical University of Munich (TUM) and my bachelor's degree in Computer Engineering at the Technical University of Berlin (TUB).
During my studies, I was a student research assistant at the TUM Computer Vision Group, working with Prof. Daniel Cremers and Dr. Xi Wang. I also collaborated closely with Kaixin Bai and Lei Zhang at Agile Robots. In addition, I worked extensively with Dr. Linh Kästner on several projects on social navigation for mobile robots at TUB and NUS.
My research focuses on 3D computer vision and human/hand-object interaction, aiming to teach robots to better perceive and understand the 3D world for complex manipulation tasks.
🔥 News
- 2025.11: 🎉 Our paper on object trajectory prediction was accepted at 3DV 2026 in Vancouver, Canada.
- 2025.08: Moved to Abu Dhabi and joined MBZUAI as a PhD student.
- 2025.06: 🎉 Our paper on the 2nd version of the Arena benchmark was accepted at IROS 2025 in Hangzhou, China.
- 2025.05: We organized the Arena 2025 challenge at the ICRA 2025 workshop "Advances in Social Navigation: Planning, HRI and Beyond".
- 2025.04: 🎉 Our paper on the 5th version of the Arena platform was accepted at RSS 2025 in Los Angeles, USA.
- 2025.01: 🎉 Our paper on the 4th version of the Arena platform was accepted at ICRA 2025 in Atlanta, USA.
- 2024.11: 🎉 Our paper on embedding learning for point clouds was accepted at 3DV 2025 in Singapore.
- 2024.04: 🎉 Our paper on the 3rd version of the Arena platform was accepted at RSS 2024 in Delft, Netherlands.
- 2023.07: I joined Agile Robots as a computer vision intern in Munich.
- 2023.06: 🎉 Our paper on the Arena-Rosnav 2.0 platform was accepted at IROS 2023 in Detroit, USA.
- 2023.04: 🎉 Our paper on a 2D DRL-based robot navigation simulator was accepted at Ubiquitous Robots 2023 in Honolulu, USA.
- 2022.10: Moved to Munich and started my master's studies at TUM.
📝 Publications
GMT: Goal-Conditioned Multimodal Transformer for 6-DOF Object Trajectory Synthesis in 3D Scenes
Huajian Zeng, Abhishek Saroha, Daniel Cremers, Xi Wang
Details coming soon
Arena-Bench 2.0: A Comprehensive Benchmark of Social Navigation Approaches in Collaborative Environments
Volodymyr Shcherbyna*, Linh Kästner*, Duc Anh Do, Jiaming Wang, Huu Giang Nguyen, Tim Seeger, Ahmed Martban, Zhengcheng Shen, Huajian Zeng, Nhan Trinh, Eva Wiese
[webpage]
[pdf]
[abstract]
[bibtex]
Social navigation has become increasingly important for robots operating in human environments, yet many newly proposed navigation methods remain narrowly tailored or exist only as proof-of-concept prototypes. Building on our previous work with Arena, a social navigation development platform, we now propose Arena-Bench 2.0, a comprehensive social navigation benchmark of state-of-the-art planners, fully integrated into the Arena framework. To achieve this, we developed a novel plugin structure, implemented on ROS2, to streamline the integration process and ensure straightforward, efficient workflows. As a demonstration, we integrated various learning-based and model-based navigation approaches and constructed a diverse set of social navigation scenarios to rigorously evaluate each planner. Specifically, we introduce a scenario generation node that allows users to construct complex, realistic social contexts through a web-based interface. We subsequently perform an extensive benchmark of all integrated planners, assessing both navigational and social metrics. Our evaluation also considers factors such as sensor input, reaction time, and latency, enabling insights into which planner may be most appropriate under different circumstances. The findings offer valuable guidance for selecting suitable planners for specific scenarios.
@inproceedings{shcherbyna2025arenabench,
author={Shcherbyna, Volodymyr and Kästner, Linh and Do, Duc Anh and Wang, Jiaming and Nguyen, Huu Giang and Seeger, Tim and Martban, Ahmed and Shen, Zhengcheng and Zeng, Huajian and Trinh, Nhan and Wiese, Eva},
booktitle={2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
title={Arena-Bench 2.0: A Comprehensive Benchmark of Social Navigation Approaches in Collaborative Environments},
year={2025},
pages={9202--9209},
doi={10.1109/IROS60139.2025.11246895}
}
Arena 5.0: A Photorealistic ROS2 Simulation Framework for Developing and Benchmarking Social Navigation
Volodymyr Shcherbyna*, Linh Kästner*, Duc Anh Do, Hoang Tung, Huu Giang Nguyen, Maximilian Ho-Kyoung Schreff, Tim Seeger, Eva Wiese, Ahmed Martban, Huajian Zeng, An Tran, Nguyen Quoc Hung, Jonas Kreutz, Vu Thanh Lam, Ton Manh Kien, Harold Soh
[webpage]
[pdf]
[abstract]
[bibtex]
Building upon the foundations laid by our previous work, this paper introduces Arena 5.0, the fifth iteration of our framework for robotics social navigation development and benchmarking. Arena 5.0 provides three main contributions: 1) The complete integration of NVIDIA Isaac Gym, enabling photorealistic simulations and more efficient training. It seamlessly incorporates Isaac Gym into the Arena platform, allowing the use of existing modules such as randomized environment generation, evaluation tools, ROS2 support, and the integration of planners, robot models, and APIs within Isaac Gym. 2) A comprehensive benchmark of state-of-the-art social navigation strategies, evaluated on a diverse set of generated and customized worlds and scenarios of varying difficulty levels. These benchmarks provide a detailed assessment of navigation planners using a wide range of social navigation metrics. 3) Extensive scenario generation and task planning modules for improved and customizable generation of social navigation scenarios, such as emergency and rescue situations. The platform’s performance was evaluated by generating the aforementioned benchmark and through a comprehensive user study, demonstrating significant improvements in usability and efficiency compared to previous versions.
@inproceedings{shcherbyna2025arena5,
title={Arena 5.0: A Photorealistic ROS2 Simulation Framework for Developing and Benchmarking Social Navigation},
author={Shcherbyna, Volodymyr and Kästner, Linh and Do, Duc Anh and Tung, Hoang and Nguyen, Huu Giang and Schreff, Maximilian Ho-Kyoung and Seeger, Tim and Wiese, Eva and Martban, Ahmed and Zeng, Huajian and Tran, An and Hung, Nguyen Quoc and Kreutz, Jonas and Lam, Vu Thanh and Kien, Ton Manh and Soh, Harold},
booktitle={Robotics: Science and Systems (RSS)},
year={2025}
}
CoE: Deep Coupled Embedding for Non-Rigid Point Cloud Correspondences
Huajian Zeng, Maolin Gao, Daniel Cremers
[webpage]
[pdf]
[abstract]
[bibtex]
[code]
The interest in matching non-rigidly deformed shapes represented as raw point clouds is rising due to the proliferation of low-cost 3D sensors. Yet, the task is challenging since point clouds are irregular and there is a lack of intrinsic shape information. We propose to tackle these challenges by learning a new shape representation - a per-point high-dimensional embedding, in an embedding space where semantically similar points share similar embeddings. The learned embedding has multiple beneficial properties: it is aware of the underlying shape geometry and is robust to shape deformations and various shape artefacts, such as noise and partiality. Consequently, this embedding can be directly employed to retrieve high-quality dense correspondences through a simple nearest neighbor search in the embedding space. Extensive experiments demonstrate new state-of-the-art results and robustness on numerous challenging non-rigid shape matching benchmarks and show the embedding's great potential in other shape analysis tasks, such as segmentation.
@inproceedings{zeng2025coe,
title={CoE: Deep Coupled Embedding for Non-Rigid Point Cloud Correspondences},
author={Zeng, Huajian and Gao, Maolin and Cremers, Daniel},
booktitle={2025 International Conference on 3D Vision (3DV)},
pages={286--295},
year={2025},
organization={IEEE}
}
Arena 4.0: A Comprehensive ROS2 Development and Benchmarking Platform for Human-Centric Navigation Using Generative-Model-Based Environment Generation
Volodymyr Shcherbyna, Linh Kästner, Diego Diaz, Huu Giang Nguyen, Maximilian Ho-Kyoung Schreff, Tim Seeger, Jonas Kreutz, Ahmed Martban, Huajian Zeng, Harold Soh
[webpage]
[pdf]
[abstract]
[bibtex]
Building upon the foundations laid by our previous work, this paper introduces Arena 4.0, a significant advancement of Arena 3.0, Arena-Bench, Arena 1.0, and Arena 2.0. Arena 4.0 provides three main novel contributions: 1) a generative-model-based world and scenario generation approach using large language models (LLMs) and diffusion models to dynamically generate complex, human-centric environments from text prompts or 2D floorplans that can be used for the development and benchmarking of social navigation strategies. 2) A comprehensive 3D model database which can be extended with 3D assets and semantically linked and annotated using a variety of metrics for dynamic spawning and arrangements inside 3D worlds. 3) The complete migration towards ROS 2, which ensures operation with state-of-the-art hardware and functionalities for improved navigation, usability, and simplified transfer towards real robots. We evaluated the platform's performance through a comprehensive user study and its world generation capabilities for benchmarking, demonstrating significant improvements in usability and efficiency compared to previous versions. Arena 4.0 is openly available at https://github.com/Arena-Rosnav.
@inproceedings{shcherbyna2025arena,
title={Arena 4.0: A Comprehensive ROS2 Development and Benchmarking Platform for Human-Centric Navigation Using Generative-Model-Based Environment Generation},
author={Shcherbyna, Volodymyr and Kästner, Linh and Diaz, Diego and Nguyen, Huu Giang and Schreff, Maximilian Ho-Kyoung and Seeger, Tim and Kreutz, Jonas and Martban, Ahmed and Shen, Zhengcheng and Zeng, Huajian and others},
booktitle={2025 IEEE International Conference on Robotics and Automation (ICRA)},
pages={9138--9144},
year={2025},
organization={IEEE}
}
ClearDepth: Enhanced Stereo Perception of Transparent Objects for Robotic Manipulation
Kaixin Bai, Huajian Zeng, Lei Zhang, Yiwen Liu, Hongli Xu, Zhaopeng Chen, Jianwei Zhang
[webpage]
[pdf]
[abstract]
[bibtex]
[video]
Transparent object depth perception poses a challenge in everyday life and logistics, primarily due to the inability of standard 3D sensors to accurately capture depth on transparent or reflective surfaces. This limitation significantly affects depth-map- and point-cloud-reliant applications, especially in robotic manipulation. We developed a vision-transformer-based algorithm for stereo depth recovery of transparent objects. This approach is complemented by an innovative feature post-fusion module, which enhances the accuracy of depth recovery by exploiting structural features in images. To address the high costs associated with dataset collection for stereo-camera-based perception of transparent objects, our method incorporates a parameter-aligned, domain-adaptive, and physically realistic Sim2Real simulation for efficient data generation, accelerated by an AI algorithm. Our experimental results demonstrate the model's exceptional Sim2Real generalizability in real-world scenarios, enabling precise depth mapping of transparent objects to assist in robotic manipulation.
@article{bai2024cleardepth,
title={ClearDepth: Enhanced Stereo Perception of Transparent Objects for Robotic Manipulation},
author={Bai, Kaixin and Zeng, Huajian and Zhang, Lei and Liu, Yiwen and Xu, Hongli and Chen, Zhaopeng and Zhang, Jianwei},
journal={arXiv preprint arXiv:2409.08926},
year={2024}
}
Arena 3.0: Advancing Social Navigation in Collaborative and Highly Dynamic Environments
Linh Kästner, Volodymyr Shcherbyna, Huajian Zeng, Tuan Anh Le, Maximilian Ho-Kyoung Schreff, Halid Osmaev, Nam Truong Tran, Diego Diaz, Jan Golebiowski, Harold Soh, et al.
[webpage]
[pdf]
[abstract]
[bibtex]
Building upon our previous contributions, this paper introduces Arena 3.0, an extension of Arena-Bench, Arena 1.0, and Arena 2.0. Arena 3.0 is a comprehensive software stack containing multiple modules and simulation environments focusing on the development, simulation, and benchmarking of social navigation approaches in collaborative environments. We significantly enhance the realism of human behavior simulation by incorporating a diverse array of new social force models and interaction patterns, encompassing both human-human and human-robot dynamics. The platform provides a comprehensive set of new task modes, designed for extensive benchmarking and testing, and is capable of dynamically generating realistic and human-centric environments, catering to a broad spectrum of social navigation scenarios. In addition, the platform's functionalities have been abstracted across three widely used simulators, each tailored for specific training and testing purposes. The platform's efficacy has been validated through an extensive benchmark and user evaluations by a global community of researchers and students, who noted substantial improvements compared to previous versions and expressed interest in utilizing the platform for future research and development.
@article{kastner2024arena,
title={Arena 3.0: Advancing Social Navigation in Collaborative and Highly Dynamic Environments},
author={Kästner, Linh and Shcherbyna, Volodymyr and Zeng, Huajian and Le, Tuan Anh and Schreff, Maximilian Ho-Kyoung and Osmaev, Halid and Tran, Nam Truong and Diaz, Diego and Golebiowski, Jan and Soh, Harold and others},
journal={arXiv preprint arXiv:2406.00837},
year={2024}
}
Arena-Rosnav 2.0: A Development and Benchmarking Platform for Robot Navigation in Highly Dynamic Environments
Linh Kästner, Reyk Carstens, Huajian Zeng, Jacek Kmiecik, Tuan Anh Le, Teham Bhuiyan, Boris Meinardus, Jens Lambrecht
[webpage]
[pdf]
[abstract]
[bibtex]
Following up on our previous works, in this paper we present Arena-Rosnav 2.0, an extension of Arena-Bench and Arena-Rosnav that adds a variety of additional modules for developing and benchmarking robotic navigation approaches. The platform is fundamentally restructured and provides unified APIs for adding functionalities such as planning algorithms, simulators, or evaluation routines. We have included more realistic simulation and pedestrian behavior and provide thorough documentation to lower the entry barrier. We evaluated our system by first conducting a user study in which we asked experienced researchers as well as new practitioners and students to test our system. The feedback was mostly positive, and a high number of participants are utilizing our system for other research endeavors. Finally, we demonstrate the feasibility of our system by integrating two new simulators and a variety of state-of-the-art navigation approaches and benchmarking them against one another.
@inproceedings{kastner2023arena,
title={Arena-Rosnav 2.0: A Development and Benchmarking Platform for Robot Navigation in Highly Dynamic Environments},
author={Kästner, Linh and Carstens, Reyk and Zeng, Huajian and Kmiecik, Jacek and Bhuiyan, Teham and Khorsandhi, Niloufar and Shcherbyna, Volodymyr and Lambrecht, Jens},
booktitle={2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
pages={11257--11264},
year={2023},
organization={IEEE}
}
Efficient 2D Simulators for Deep-Reinforcement-Learning-based Training of Navigation Approaches
Huajian Zeng, Linh Kästner, Jens Lambrecht
[webpage]
[pdf]
[abstract]
[bibtex]
In recent years, Deep Reinforcement Learning (DRL) has emerged as a competitive approach for mobile robot navigation. However, training DRL agents often comes at the cost of difficult and tedious training procedures in which powerful hardware is required to conduct oftentimes long training runs. Especially for complex environments, this proves to be a major bottleneck for the widespread adoption of DRL approaches in industry. In this paper, we integrate an efficient 2D simulator into the Arena-Rosnav framework of our previous work as an alternative simulation platform to train and develop DRL agents. To this end, we utilized the provided API to integrate the necessary components into the Arena-Rosnav ecosystem. We evaluated our simulator by training a DRL agent on that platform and compared the training and navigational performance against the baseline 2D simulator Flatland, which is the default simulation platform of Arena-Rosnav. Results demonstrate that using our Arena2D simulator leads to substantially faster training times and, in some scenarios, better agents. This proves to be an important step towards resource-efficient DRL training, accelerating training and improving the development cycle of DRL agents for navigation tasks.
@inproceedings{zeng2023efficient,
title={Efficient 2D Simulators for Deep-Reinforcement-Learning-based Training of Navigation Approaches},
author={Zeng, Huajian and Kästner, Linh and Lambrecht, Jens},
booktitle={2023 20th International Conference on Ubiquitous Robots (UR)},
pages={275--280},
year={2023},
organization={IEEE}
}
📖 Education
- 2025.08 - 2029.07 (expected), PhD, Robotics, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi.
- 2022.09 - 2025.02, Master's, Informatics, Technical University of Munich, Munich.
- 2019.10 - 2022.09, Bachelor's, Computer Engineering, Technical University of Berlin, Berlin.
🎖 Honors and Awards
- 2024.10 Deutschlandstipendium 2024/2025
📚 Academic Services
- Conference Reviewer: ICRA, IROS
💻 Internships
- 2023.07 - 2024.05, Computer Vision Intern, Agile Robots, Munich.
🤝 Volunteer Work
- 2024.06 - 2024.07, UEFA Euro 2024, Munich.