GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scenes

ICCV 2025
Xiao Chen, Tai Wang, Quanyi Li, Tao Huang, Jiangmiao Pang, Tianfan Xue
The Chinese University of Hong Kong · Shanghai AI Laboratory

Project Page

📋 Contents

  1. About
  2. Dataset
  3. Installation
  4. Training & Evaluation
  5. Citation
  6. License

🏠 About


Generalizable active mapping in complex unknown environments remains a critical challenge for mobile robots. Existing methods, constrained by insufficient training data and conservative exploration strategies, exhibit limited generalizability across scenes with diverse layouts and complex connectivity. To enable scalable training and reliable evaluation, we introduce GLEAM-Bench, the first large-scale benchmark designed for generalizable active mapping with 1,152 diverse 3D scenes from synthetic and real-scan datasets. Building upon this foundation, we propose GLEAM, a unified generalizable exploration policy for active mapping. Its superior generalizability comes mainly from our semantic representations, long-term navigable goals, and randomized strategies. It significantly outperforms state-of-the-art methods, achieving 66.50% coverage (+9.49%) with efficient trajectories and improved mapping accuracy on 128 unseen complex scenes.

📊 Dataset

GLEAM-Bench includes 1,152 diverse 3D scenes from synthetic and real-scan datasets for benchmarking generalizable active mapping policies. These curated scene meshes are characterized by near-watertight geometry, diverse floorplans (≥10 types), and complex interconnectivity. We unify these multi-source datasets through filtering, geometric repair, and task-oriented preprocessing. Please refer to the guide for more details and scripts.

We provide all the preprocessed data used in our work, including mesh files (in the obj folder), ground-truth surface points (in the gt folder), and asset indexing files (in the urdf folder). We recommend users fill out the form to access the download link [HERE]. The directory structure should be as follows.

GLEAM
├── README.md
├── gleam
│   ├── train
│   ├── test
│   ├── ...
├── data_gleam
│   ├── README.md
│   ├── train_stage1_512
│   │   ├── gt
│   │   ├── obj
│   │   ├── urdf
│   ├── train_stage2_512
│   │   ├── gt
│   │   ├── obj
│   │   ├── urdf
│   ├── eval_128
│   │   ├── gt
│   │   ├── obj
│   │   ├── urdf
├── ...
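
To sanity-check the download before training, a minimal sketch like the following (assuming the layout above; adjust DATA_ROOT if you unpacked the data elsewhere) reports any missing subfolders:

from pathlib import Path

# Split and subfolder names are taken from the directory tree above.
DATA_ROOT = Path("data_gleam")
SPLITS = ["train_stage1_512", "train_stage2_512", "eval_128"]
SUBDIRS = ["gt", "obj", "urdf"]  # ground-truth points, meshes, asset indexing files

for split in SPLITS:
    for sub in SUBDIRS:
        path = DATA_ROOT / split / sub
        print(f"{path}: {'ok' if path.is_dir() else 'MISSING'}")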

🛠️ Installation

We test our code under the following environment:

  • NVIDIA RTX 3090/4090 (24GB VRAM)
  • NVIDIA Driver: 545.29.02
  • Ubuntu 20.04
  • CUDA 11.8
  • Python 3.8.12
  • PyTorch 2.0.0+cu118
  1. Clone this repository.
git clone https://github.com/zjwzcx/GLEAM
cd GLEAM
  2. Create an environment and install PyTorch.
conda create -n gleam python=3.8 -y
conda activate gleam
pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 torchaudio==2.0.1+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
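
To quickly confirm that the CUDA build installed correctly (a sanity check of ours, not part of the official setup), run the following in the gleam environment:

import torch

# Verify the expected version and that the GPU is visible to PyTorch.
print(torch.__version__)          # expected: 2.0.0+cu118
print(torch.cuda.is_available())  # expected: True with a working NVIDIA driver
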
  3. Install NVIDIA Isaac Gym: https://developer.nvidia.com/isaac-gym/download
cd isaacgym/python
pip install -e .
  4. Install GLEAM.
pip install -r requirements.txt
pip install -e .
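
As a quick smoke test that the Isaac Gym bindings are importable (a minimal sketch of ours, not part of the repository), the following should run without errors on a supported GPU machine:

from isaacgym import gymapi

# acquire_gym() returns the singleton interface to the simulator.
gym = gymapi.acquire_gym()
print("Isaac Gym loaded:", gym is not None)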

🕹️ Training & Evaluation

Weights & Biases (wandb) is highly recommended for analyzing the training logs. If you want to use wandb with our codebase, please paste your wandb API key into wandb_utils/wandb_api_key_file.txt. If you don't want to use wandb, please add --stop_wandb to the following commands.
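
For reference, authenticating from a key file like this amounts to roughly the following (a sketch of the general wandb pattern; the codebase's own loading logic may differ):

import wandb

# Read the API key from the file referenced above and authenticate.
with open("wandb_utils/wandb_api_key_file.txt") as f:
    wandb.login(key=f.read().strip())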

We provide the standard checkpoints of GLEAM [HERE]. Please use the 40k-step checkpoint as the standard. We also provide Stage 2 checkpoints trained without the 96 Gibson scenes; excluding them makes the model more robust and stable overall.

Training

Please run the following commands to reproduce the standard two-stage training of GLEAM.

Stage 1 with 512 scenes:

python gleam/train/train_gleam_stage1.py --sim_device=cuda:0 --num_envs=32 --headless

Stage 2 trains on 512 additional scenes, continuing from the pretrained Stage 1 checkpoint (specified by --ckpt_path). Taking our released checkpoint as an example, --ckpt_path should be runs/train_gleam_stage1/models/rl_model_40000000_steps.zip:

python gleam/train/train_gleam_stage2.py --sim_device=cuda:0 --num_envs=32 --headless --ckpt_path=${YOUR_CKPT_PATH}

Customized Training Environments

If you want to customize a novel training environment, create your environment and configuration files in gleam/env and then define the task in gleam/__init__.py, as sketched below.
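
As a rough illustration only (the class names and the registration call below are hypothetical; mirror an existing environment in gleam/env and the existing entries in gleam/__init__.py for the actual base classes and API), a new task pairs an environment with its config and registers them under a task name:

# gleam/env/env_my_task.py (hypothetical file; mirror an existing environment)
class MyTaskConfig:
    # In practice, subclass the repo's base config and override its fields.
    num_envs = 32

class MyTaskEnv:
    # In practice, subclass the repo's base environment class.
    def __init__(self, cfg):
        self.cfg = cfg

# gleam/__init__.py (hypothetical registration; copy the pattern of existing tasks)
# task_registry.register("my_task", MyTaskEnv, MyTaskConfig)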

Evaluation

Please run the following command to evaluate the generalization performance of GLEAM on 128 unseen scenes from the test set of GLEAM-Bench. Specify the checkpoint via --ckpt_path.

python gleam/test/test_gleam_gleambench.py --sim_device=cuda:0 --num_envs=32 --headless --stop_wandb --ckpt_path=${YOUR_CKPT_PATH}

Main Results

πŸ“ TODO List

  • Release GLEAM-Bench (dataset) and the arXiv paper in May 2025.
  • Release the training code in May 2025.
  • Release the evaluation code in June 2025.
  • Release the key scripts in June 2025.
  • Release the pretrained checkpoint in June 2025.

🤔 FAQ

Q: Is it normal that the program gets stuck for about 5-60 minutes during training and testing?
A: This is normal because the simulation environment needs to load 1024 complex 3D scenes (for training) or 128 complex 3D scenes (for evaluation) at once, which is very time-consuming. For initial use, it is recommended to modify the hardcoded parameters (number of training scenes for stage1 and number of evaluation scenes) to reduce the number of loaded scenes for a quick run-through.

Q: Is it normal that the 3D scenes in the visualization UI have no textures?
A: This is normal. Textures are removed from the preprocessed data to speed up simulation and rendering, since RGB/texture information is not required for geometry-level exploration; this trade-off accelerates training and keeps the focus on 3D spatial exploration. If you need scenes with textures, we recommend downloading the raw version of GLEAM-Bench. Please refer to the guide for more details.

🔗 Citation

If you find our work helpful, please cite it:

@article{chen2025gleam,
  title={GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scenes},
  author={Chen, Xiao and Wang, Tai and Li, Quanyi and Huang, Tao and Pang, Jiangmiao and Xue, Tianfan},
  journal={arXiv preprint arXiv:2505.20294},
  year={2025}
}

If you use our codebase, dataset, or benchmark, please also cite the original datasets involved in our work. BibTeX entries are provided below.

Dataset BibTeX
@article{ai2thor,
  author={Eric Kolve and Roozbeh Mottaghi and Winson Han and
          Eli VanderBilt and Luca Weihs and Alvaro Herrasti and
          Daniel Gordon and Yuke Zhu and Abhinav Gupta and
          Ali Farhadi},
  title={{AI2-THOR: An Interactive 3D Environment for Visual AI}},
  journal={arXiv},
  year={2017}
}
@inproceedings{chen2024gennbv,
  title={GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction},
  author={Chen, Xiao and Li, Quanyi and Wang, Tai and Xue, Tianfan and Pang, Jiangmiao},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}
@inproceedings{rudin2022learning,
  title={Learning to walk in minutes using massively parallel deep reinforcement learning},
  author={Rudin, Nikita and Hoeller, David and Reist, Philipp and Hutter, Marco},
  booktitle={Conference on Robot Learning},
  pages={91--100},
  year={2022},
  organization={PMLR}
}
@inproceedings{procthor,
  author={Matt Deitke and Eli VanderBilt and Alvaro Herrasti and
          Luca Weihs and Jordi Salvador and Kiana Ehsani and
          Winson Han and Eric Kolve and Ali Farhadi and
          Aniruddha Kembhavi and Roozbeh Mottaghi},
  title={{ProcTHOR: Large-Scale Embodied AI Using Procedural Generation}},
  booktitle={NeurIPS},
  year={2022},
  note={Outstanding Paper Award}
}
@inproceedings{xiazamirhe2018gibsonenv,
  title={Gibson Env: real-world perception for embodied agents},
  author={Xia, Fei and R. Zamir, Amir and He, Zhi-Yang and Sax, Alexander and Malik, Jitendra and Savarese, Silvio},
  booktitle={Computer Vision and Pattern Recognition (CVPR), 2018 IEEE Conference on},
  year={2018},
  organization={IEEE}
}
@article{khanna2023hssd,
  author={Khanna*, Mukul and Mao*, Yongsen and Jiang, Hanxiao and Haresh, Sanjay and Shacklett, Brennan and Batra, Dhruv and Clegg, Alexander and Undersander, Eric and Chang, Angel X. and Savva, Manolis},
  title={{Habitat Synthetic Scenes Dataset (HSSD-200): An Analysis of 3D Scene Scale and Realism Tradeoffs for ObjectGoal Navigation}},
  journal={arXiv preprint},
  year={2023},
  eprint={2306.11290},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
@article{Matterport3D,
  title={Matterport3D: Learning from RGB-D Data in Indoor Environments},
  author={Chang, Angel and Dai, Angela and Funkhouser, Thomas and Halber, Maciej and Niessner, Matthias and Savva, Manolis and Song, Shuran and Zeng, Andy and Zhang, Yinda},
  journal={International Conference on 3D Vision (3DV)},
  year={2017}
}

📄 License

This work is under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
