Jingyun Yang, Isabella Huang*, Brandon Vu*, Max Bajracharya, Rika Antonova, Jeannette Bohg
We introduce the "policy mobilization" problem: find a mobile robot base pose in a novel environment that is in distribution with respect to a manipulation policy trained on a limited set of camera viewpoints.
To study policy mobilization, we introduce the Mobi-π framework, which includes: (1) metrics that quantify the difficulty of mobilizing a given policy, (2) a suite of simulated mobile manipulation tasks based on RoboCasa to evaluate policy mobilization, (3) visualization tools for analysis, and (4) several baseline methods. We also propose a novel approach that bridges navigation and manipulation by optimizing the robot's base pose to align with an in-distribution base pose for a learned policy.
This repository includes the following contents:
- Pre-trained manipulation checkpoints for the 5 simulated benchmark tasks.
- Instructions for training new manipulation policies.
- Implementation of our proposed method and two baselines (BC w/ Nav and LeLaN).
- Instructions for training and evaluating our proposed method.
- Instructions for running baselines with provided checkpoints.
- Visualization tools.
- Plotting scripts used in the paper.
Install our repo with the following script:
conda create -c conda-forge -n mobipi python=3.10
conda activate mobipi
chmod +x install.sh
./install.sh
If you wish to set up custom directories for your data and checkpoints, we recommend setting up macros. Run the following script: python -m mobipi.scripts.setup_macros. This should create mobipi/macros_private.py. In this private macros file, edit the following constants:
SCENE_MODEL_ROOT_DIR = [insert your selected directory for 3D Gaussian Splatting models]
POLICY_CKPT_ROOT_DIR = [insert your selected directory for reading policy checkpoints]
LOG_ROOT_DIR = [insert your selected directory for evaluation logging]
DATA_ROOT_DIR = [insert your selected directory for policy training data]
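For example, a minimal setup might keep everything under a single root directory. The paths below are placeholders, not defaults shipped with the repository:
mkdir -p ~/mobipi_assets/{scene_models,policy_ckpts,logs,training_data}
python -m mobipi.scripts.setup_macros
# Then point the constants in mobipi/macros_private.py at these folders, e.g.:
#   SCENE_MODEL_ROOT_DIR = "/home/<user>/mobipi_assets/scene_models"
#   POLICY_CKPT_ROOT_DIR = "/home/<user>/mobipi_assets/policy_ckpts"
#   LOG_ROOT_DIR = "/home/<user>/mobipi_assets/logs"
#   DATA_ROOT_DIR = "/home/<user>/mobipi_assets/training_data"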
Troubleshooting Tips
- If you get any PyTorch-related errors, make sure (1) your torch version is 2.1.1 and supports your GPU; (2) your numpy version is 1.23.5; (3) your timm version is 1.0.12.
- If you get errors that look like 'NoneType' object has no attribute 'CameraModelType', or if your script gets stuck forever at gsplat: Setting up CUDA with MAX_JOBS=10, try reinstalling the gsplat library following the instructions in this GitHub issue.
Here is the outline of prominent components of the codebase:
- external/: third-party libraries such as diffusion-policy and robocasa.
- mobipi/:
  - scene_model/: scripts for collecting images for training a 3D Gaussian Splatting model and interfacing with a trained model.
  - nav/: scripts for generating LeLaN fine-tuning episodes and interfacing with a trained LeLaN checkpoint.
  - eval/: scripts for evaluating competing methods and retrieving evaluation result statistics.
  - utils/: various utility scripts, including code for computing score functions, handling I/O, processing media content, and loading policy checkpoints.
  - vis/: stand-alone scripts for visualizing method performance and plotting result figures.
  - scripts/: utility script to set up macros.
To run any method for policy mobilization, we first need to prepare the manipulation policies to be mobilized. There are two ways to obtain them: (1) download our pre-trained checkpoints for the 5 benchmark tasks or (2) train new checkpoints on your own.
Downloading Existing Policy Checkpoints: You can download pre-trained manipulation policies using the following script: python mobipi/scripts/download_pi.py. The script will walk you through selecting the models you wish to download.
Training Policies from Scratch
To train a policy in RoboCasa from scratch, you first need to download a dataset generated by MimicGen, filter it so that only the training scene split is used, and then train the policy.
To begin, open external/robocasa/robocasa/macros_private.py and set DATASET_BASE_PATH to the same dataset root directory you specified in the mobipi macros file. Download the MimicGen dataset using the following script (you can switch the task name in the command line arguments):
python robocasa/scripts/download_datasets.py --ds_types mg_im --tasks CloseSingleDoor
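If you want data for all five benchmark tasks, you can loop over the task names (a sketch; the task names are taken from the evaluation section below, and the download script is invoked once per task):
for task in CloseSingleDoor CloseDrawer TurnOnMicrowave TurnOnSinkFaucet TurnOnStove; do
  python robocasa/scripts/download_datasets.py --ds_types mg_im --tasks "$task"
done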
Then, assuming that the data is downloaded in directory ~/robocasa, run the following script to filter the dataset. Make sure you are using the robocasa installation in the git submodules (in external), not the original robocasa library. We use layouts 0, 2, 3, 5, 6 for training and 1, 4, 7, 8, 9 for testing. This split is selected by looking through all layouts and balancing the appearances of different room shapes between the training and test splits.
OMP_NUM_THREADS=8 MPI_NUM_THREADS=8 MKL_NUM_THREADS=8 OPENBLAS_NUM_THREADS=8 python robocasa/scripts/dataset_states_to_obs.py --dataset ~/robocasa/datasets/v0.1/single_stage/kitchen_doors/CloseSingleDoor/mg/2024-05-04-22-34-56/demo_gentex_im128_randcams.hdf5 --filter_layouts 0,2,3,5,6 --n 300
Finally, set up policy learning with the following steps:
- Navigate to external/robomimic/scripts; run python setup_macros.py.
- Navigate to external/robomimic; edit macros_private.py to configure wandb and experiment data directories.
- Retrieve training commands by running python robomimic/scripts/config_gen/gen_door.py --name CloseSingleDoor --n_seeds 3. You can change the script name gen_door and the --name argument to switch from one environment to another.
- Then, simply execute the training commands printed by the previous script to run policy training (see the combined sketch after this list).
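Put together, a typical run of these steps looks roughly like the following sketch (run from the repository root; the macros_private.py edits for wandb and data directories still need to be done by hand):
cd external/robomimic/scripts
python setup_macros.py
cd ..
# Edit macros_private.py here to configure wandb and experiment data directories.
python robomimic/scripts/config_gen/gen_door.py --name CloseSingleDoor --n_seeds 3
# Copy the training commands printed above and run them to start policy training.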
Our proposed method assumes the availability of 3D Gaussian Splatting models of the scene. There are two ways to obtain 3D Gaussian Splatting models: (1) download a 3D Gaussian Splatting model that we already trained for the benchmark; (2) train one of your own.
Downloading Existing 3D Gaussian Splatting Checkpoints: We train a Gaussian Splatting model for each task and scene. To download one or more models, run the following script: python mobipi/scripts/download_scene_models.py. The script will walk you through selecting the models you wish to download.
Train 3D Gaussian Splatting Models from Scratch: To obtain 3DGS models of scenes in the sim benchmark, navigate to mobipi/scene_model and run the following script. You can switch environments and scene IDs via command line arguments. The test-time scenes have IDs 1, 4, 7, 8, 9. This script will generate necessary data for training the 3DGS model and train the model. After the execution completes, find the generated scene_data directory and move contents inside it to your specified SCENE_MODEL_ROOT_DIR.
python collect_images_batch.py --env_names CloseDrawer --style_ids 1,4,7,8,9
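To cover all five benchmark tasks and move the resulting models into place, a loop like the following works (a sketch; it assumes the environment names match the task names used elsewhere in this readme and that the destination is the SCENE_MODEL_ROOT_DIR configured in your macros file):
cd mobipi/scene_model
for env in CloseSingleDoor CloseDrawer TurnOnMicrowave TurnOnSinkFaucet TurnOnStove; do
  python collect_images_batch.py --env_names "$env" --style_ids 1,4,7,8,9
done
# Move the generated scene models to the directory configured in mobipi/macros_private.py.
mv scene_data/* /path/to/SCENE_MODEL_ROOT_DIR/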
To evaluate the performance of a given manipulation policy, make sure you have completed Section 2.1 of the readme. Then, navigate into mobipi/eval/ and run the following script:
python eval_baseline.py --env_name CloseSingleDoor --layout_id -1 --seed 1 \
--baseline_name vanilla_policy --randomize_base_init_pose 0.00
Here, you can switch out CloseSingleDoor for any other environment (i.e., task) name. In our evaluation setup, we keep layout_id and style_id the same (layout_id = 1, style_id = 1; layout_id = 4, style_id = 4; etc.). You can also specify --layout_id -1 and drop the --style_id argument to test in all scenes in the evaluation split. If you wish to test a different layout and style combination, train scene models for that specific combination before evaluation.
The --randomize_base_init_pose option allows you to test policy performance at varying base pose offsets from the default initial base pose.
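For instance, to sweep the offsets used for the spatial feasibility metric later in this readme, you could run something like the following (a sketch; adjust the task and seed as needed):
cd mobipi/eval
for offset in 0.00 0.05 0.10 0.15 0.20 0.25 0.30; do
  python eval_baseline.py --env_name CloseSingleDoor --layout_id -1 --seed 1 \
    --baseline_name vanilla_policy --randomize_base_init_pose "$offset"
done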
To run our method, make sure you have completed Sections 2.1 and 2.2 in this readme. Navigate into mobipi/eval/ in this repository, and then run the following script:
python eval_mobipi.py --env_name CloseSingleDoor --scene_ids 1 --seed 1 --vis
In this command, --env_name sets the environment name. You can select among the following environments: CloseSingleDoor, CloseDrawer, TurnOnMicrowave, TurnOnSinkFaucet, TurnOnStove. The --scene_ids argument sets the (layout_id, style_id) setup for evaluation (RoboCasa maintains a set of different room layouts and styles). In our experiment setup, we always keep layout and style IDs the same. You can set the scene ID to any number from 0 to 9. To test in all evaluation layouts, set --scene_ids to -1.
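To cover the full benchmark, you can loop over all five environments with --scene_ids -1 (a sketch based on the flags above):
cd mobipi/eval
for env in CloseSingleDoor CloseDrawer TurnOnMicrowave TurnOnSinkFaucet TurnOnStove; do
  python eval_mobipi.py --env_name "$env" --scene_ids -1 --seed 1 --vis
done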
To run LeLaN, make sure you have completed Sections 2.1 and 2.2 in this readme. First, download the fine-tuned LeLaN checkpoint here. Then, run the LeLaN baseline by navigating into mobipi/eval/ and running the following script:
python eval_baseline.py --env_name CloseSingleDoor --layout_id 1 --style_id 1 --seed 1 \
--baseline_name lelan --check_collisions --lelan_ckpt_path [insert checkpoint path]
As explained in Section 2.2, you can switch to different environments, layouts, and styles using the command line arguments. The script will run LeLaN for navigation for a maximum of 500 steps and then switch to running the manipulation policy. Results will be saved into the log directory.
To run BC w/ Nav, make sure you have completed Section 2.1 of this readme. First, download the policy checkpoints:
python mobipi/scripts/download_pi.py --nav
Then, run evaluation for this method by navigating into mobipi/eval/ and running the following script:
python eval_baseline.py --env_name CloseSingleDoor --layout_id 1 --style_id 1 --seed 1 \
--baseline_name il_nav --horizon 800
Similar to running the LeLaN baseline, you can switch to any other environment, layout, and style with the command line arguments. The script will run the BC w/ Nav baseline with an episode horizon of 800. This horizon value is set to be larger than the length of all training demos to ensure there is enough time for task completion.
To collect evaluation statistics, navigate into mobipi/eval/ and run python compute_success_rate.py [your log root directory]/[task name]/[method name]/bc_xfmr. The script will compute and print out the mean and standard deviation of success rates for the specified environment and method.
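For example, assuming your log root directory is ~/mobipi_logs and the method's log folder matches its baseline name (e.g., lelan), the call looks like:
cd mobipi/eval
python compute_success_rate.py ~/mobipi_logs/CloseSingleDoor/lelan/bc_xfmr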
In the paper, we introduced metrics that quantify the difficulty of mobilizing a given policy. Here, we show how to compute the spatial mobilization feasibility metric.
To compute the spatial metric, we first need to know the success rate of the given manipulation policy at different base pose deviations. To obtain these statistics, walk through Section 2.3 to run policy evaluation for your desired task across all evaluation scenes with --randomize_base_init_pose set to the following values: 0.0, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30. Then, walk through Section 2.7 to collect evaluation results for all of these settings. Finally, navigate into mobipi/vis/, fill in this dictionary with your collected mean success rate values, and run the following script:
python mobipi/vis/plot_spatial_metric.py
We also provide visualization tools to inspect scenes and method performances.
- Visualize Topdown Maps of a Scene: Run python mobipi/vis/plot_topdown.py.
- Visualize Navigation Targets Selected by Baselines and Our Method: Edit this line, this line, and this line to adjust environment and method names. Then, use mobipi/vis/plot_nav_poses.py to produce a plot.
- Visualize an Animation of the Optimization Process in Blender: First, make sure you have completed at least one evaluation episode using our method. Then, navigate to mobipi/vis/blender and follow the instructions at the top of replay_episode.py to run it. This should create a subdirectory called success in your replay output directory. In this success subdirectory, you will see folders named replay_*/. Note down the path to one of these folders and use it to run render_episode.py, following the instructions at the top of that file.
Interested in producing a result plot similar to ours? Check out mobipi/vis/plot_sim_table.py and mobipi/vis/plot_real_table.py.
This codebase is licensed under the terms of the MIT License.
- The simulation code is based on RoboCasa, RoboMimic, and MimicGen.
- Code from LeLaN is used in baseline implementations.
- Neural rendering implementation is based on NerfStudio.