Jingyun Yang, Isabella Huang*, Brandon Vu*, Max Bajracharya, Rika Antonova, Jeannette Bohg
We introduce the "policy mobilization" problem: find a mobile robot base pose in a novel environment that is in distribution with respect to a manipulation policy trained on a limited set of camera viewpoints.
To study policy mobilization, we introduce the Mobi-π framework, which includes: (1) metrics that quantify the difficulty of mobilizing a given policy, (2) a suite of simulated mobile manipulation tasks based on RoboCasa to evaluate policy mobilization, (3) visualization tools for analysis, and (4) several baseline methods. We also propose a novel approach that bridges navigation and manipulation by optimizing the robot's base pose to align with an in-distribution base pose for a learned policy.
This repository includes the following contents:
- Pre-trained manipulation checkpoints for the 5 simulated benchmark tasks.
- Instructions for training new manipulation policies.
- Implementation of our proposed method and two baselines (BC w/ Nav and LeLaN).
- Instructions for training and evaluating our proposed method.
- Instructions for running baselines with provided checkpoints.
- Visualization tools.
- Plotting scripts used in the paper.
Install our repo with the following script:
conda create -c conda-forge -n mobipi python=3.10
conda activate mobipi
chmod +x install.sh
./install.sh
If you wish to set up custom directories for your data and checkpoints, we recommend setting up macros. Run the following script: python -m mobipi.scripts.setup_macros. This should create mobipi/macros_private.py. In this private macros file, edit the following constants:
SCENE_MODEL_ROOT_DIR = [insert your selected directory for 3D Gaussian Splatting models]
POLICY_CKPT_ROOT_DIR = [insert your selected directory for reading policy checkpoints]
LOG_ROOT_DIR = [insert your selected directory for evaluation logging]
DATA_ROOT_DIR = [insert your selected directory for policy training data]
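For example, a minimal setup might keep everything under a single root directory. The paths below are placeholders, not defaults shipped with the repository:
mkdir -p ~/mobipi_assets/{scene_models,policy_ckpts,logs,training_data}
python -m mobipi.scripts.setup_macros
# Then point the constants in mobipi/macros_private.py at these folders, e.g.:
#   SCENE_MODEL_ROOT_DIR = "/home/<user>/mobipi_assets/scene_models"
#   POLICY_CKPT_ROOT_DIR = "/home/<user>/mobipi_assets/policy_ckpts"
#   LOG_ROOT_DIR = "/home/<user>/mobipi_assets/logs"
#   DATA_ROOT_DIR = "/home/<user>/mobipi_assets/training_data"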
Troubleshooting Tips
- If you get any PyTorch-related errors, make sure (1) your torch version is 2.1.1 and supports your GPU; (2) your numpy version is 1.23.5; (3) your timm version is 1.0.12.
- If you get errors that look like 'NoneType' object has no attribute 'CameraModelType', or if your script gets stuck forever at gsplat: Setting up CUDA with MAX_JOBS=10, try reinstalling the gsplat library following the instructions in this GitHub issue.
Here is the outline of prominent components of the codebase:
- external/: third-party libraries such as diffusion-policy and robocasa.
- mobipi/:
  - scene_model/: scripts for collecting images for training a 3D Gaussian Splatting model and interfacing with a trained model.
  - nav/: scripts for generating LeLaN fine-tuning episodes and interfacing with a trained LeLaN checkpoint.
  - eval/: scripts for evaluating competing methods and retrieving evaluation result statistics.
  - utils/: various utility scripts, including code for computing score functions, handling I/O, processing media content, and loading policy checkpoints.
  - vis/: stand-alone scripts for visualizing method performance and plotting result figures.
  - scripts/: utility script to set up macros.
To run any method for policy mobilization, we first need to prepare the manipulation policies to be mobilized. There are two ways to obtain them: (1) download our pre-trained checkpoints for the 5 benchmark tasks or (2) train new checkpoints on your own.
Downloading Existing Policy Checkpoints: You can download pre-trained manipulation policies using the following script: python mobipi/scripts/download_pi.py. The script will walk you through selecting the models you wish to download.
Training Policies from Scratch
To train a policy in RoboCasa from scratch, you first need to download a dataset generated by MimicGen, filter it so that only the training scene split is used, and then train the policy.
To begin, open external/robocasa/robocasa/macros_private.py and set DATASET_BASE_PATH to the same dataset root directory you specified in the mobipi macros file. Download the MimicGen dataset using the following script (you can switch the task name in the command line arguments):
python robocasa/scripts/download_datasets.py --ds_types mg_im --tasks CloseSingleDoor
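If you want data for all five benchmark tasks, you can loop over the task names (a sketch; the task names are taken from the evaluation section below, and the download script is invoked once per task):
for task in CloseSingleDoor CloseDrawer TurnOnMicrowave TurnOnSinkFaucet TurnOnStove; do
  python robocasa/scripts/download_datasets.py --ds_types mg_im --tasks "$task"
done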
Then, assuming that the data is downloaded in directory ~/robocasa, run the following script to filter the dataset. Make sure you are using the robocasa installation in the git submodules (in external), not the original robocasa library. We use layouts 0, 2, 3, 5, 6 for training and 1, 4, 7, 8, 9 for testing. This split is selected by looking through all layouts and balancing the appearances of different room shapes between the training and test splits.
OMP_NUM_THREADS=8 MPI_NUM_THREADS=8 MKL_NUM_THREADS=8 OPENBLAS_NUM_THREADS=8 python robocasa/scripts/dataset_states_to_obs.py --dataset ~/robocasa/datasets/v0.1/single_stage/kitchen_doors/CloseSingleDoor/mg/2024-05-04-22-34-56/demo_gentex_im128_randcams.hdf5 --filter_layouts 0,2,3,5,6 --n 300
Finally, set up policy learning with the following steps:
- Navigate to external/robomimic/scripts; run python setup_macros.py.
- Navigate to external/robomimic; edit macros_private.py to configure wandb and experiment data directories.
- Retrieve training commands by running python robomimic/scripts/config_gen/gen_door.py --name CloseSingleDoor --n_seeds 3. You can change the script name gen_door and the --name argument to switch from one environment to another.
- Then, simply execute the training commands printed by the previous script to run policy training (see the combined sketch after this list).
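Put together, a typical run of these steps looks roughly like the following sketch (run from the repository root; the macros_private.py edits for wandb and data directories still need to be done by hand):
cd external/robomimic/scripts
python setup_macros.py
cd ..
# Edit macros_private.py here to configure wandb and experiment data directories.
python robomimic/scripts/config_gen/gen_door.py --name CloseSingleDoor --n_seeds 3
# Copy the training commands printed above and run them to start policy training.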
Our proposed method assumes the availability of 3D Gaussian Splatting models of the scene. There are two ways to obtain 3D Gaussian Splatting models: (1) download a 3D Gaussian Splatting model that we already trained for the benchmark; (2) train one of your own.
Downloading Existing 3D Gaussian Splatting Checkpoints: We train a Gaussian Splatting model for each task and scene. To download one or more models, run the following script: python mobipi/scripts/download_scene_models.py. The script will walk you through selecting the models you wish to download.
Train 3D Gaussian Splatting Models from Scratch: To obtain 3DGS models of scenes in the sim benchmark, navigate to mobipi/scene_model and run the following script. You can switch environments and scene IDs via command line arguments. The test-time scenes have IDs 1, 4, 7, 8, 9. This script will generate necessary data for training the 3DGS model and train the model. After the execution completes, find the generated scene_data directory and move contents inside it to your specified SCENE_MODEL_ROOT_DIR.
python collect_images_batch.py --env_names CloseDrawer --style_ids 1,4,7,8,9
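To cover all five benchmark tasks and move the resulting models into place, a loop like the following works (a sketch; it assumes the environment names match the task names used elsewhere in this readme and that the destination is the SCENE_MODEL_ROOT_DIR configured in your macros file):
cd mobipi/scene_model
for env in CloseSingleDoor CloseDrawer TurnOnMicrowave TurnOnSinkFaucet TurnOnStove; do
  python collect_images_batch.py --env_names "$env" --style_ids 1,4,7,8,9
done
# Move the generated scene models to the directory configured in mobipi/macros_private.py.
mv scene_data/* /path/to/SCENE_MODEL_ROOT_DIR/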
To evaluate the performance of a given manipulation policy, make sure you have completed Section 2.1 of the readme. Then, navigate into mobipi/eval/ and run the following script:
python eval_baseline.py --env_name CloseSingleDoor --layout_id -1 --seed 1 \
--baseline_name vanilla_policy --randomize_base_init_pose 0.00
Here, you can switch out CloseSingleDoor for any other environment (i.e., task) name. In our evaluation setup, we keep layout_id and style_id the same (layout_id = 1, style_id = 1; layout_id = 4, style_id = 4; etc.). You can also specify --layout_id -1 and drop the --style_id argument to test in all scenes in the evaluation split. If you wish to test a different layout and style combination, train scene models for that specific combination before evaluation.
The --randomize_base_init_pose option allows you to test policy performance at varying base pose offsets from the default initial base pose.
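For instance, to sweep the offsets used for the spatial feasibility metric later in this readme, you could run something like the following (a sketch; adjust the task and seed as needed):
cd mobipi/eval
for offset in 0.00 0.05 0.10 0.15 0.20 0.25 0.30; do
  python eval_baseline.py --env_name CloseSingleDoor --layout_id -1 --seed 1 \
    --baseline_name vanilla_policy --randomize_base_init_pose "$offset"
done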
To run our method, make sure you have completed Sections 2.1 and 2.2 in this readme. Navigate into mobipi/eval/ in this repository, and then run the following script:
python eval_mobipi.py --env_name CloseSingleDoor --scene_ids 1 --seed 1 --vis
In this command, --env_name sets the environment name. You can select among the following environments: CloseSingleDoor, CloseDrawer, TurnOnMicrowave, TurnOnSinkFaucet, TurnOnStove. The --scene_ids argument sets the (layout_id, style_id) setup for evaluation (RoboCasa maintains a set of different room layouts and styles). In our experiment setup, we always keep layout and style IDs the same. You can set the scene ID to any number from 0 to 9. To test in all evaluation layouts, set --scene_ids to -1.
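To cover the full benchmark, you can loop over all five environments with --scene_ids -1 (a sketch based on the flags above):
cd mobipi/eval
for env in CloseSingleDoor CloseDrawer TurnOnMicrowave TurnOnSinkFaucet TurnOnStove; do
  python eval_mobipi.py --env_name "$env" --scene_ids -1 --seed 1 --vis
done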
To run LeLaN, make sure you have completed Sections 2.1 and 2.2 in this readme. First, download the fine-tuned LeLaN checkpoint here. Then, run the LeLaN baseline by navigating into mobipi/eval/ and running the following script:
python eval_baseline.py --env_name CloseSingleDoor --layout_id 1 --style_id 1 --seed 1 \
--baseline_name lelan --check_collisions --lelan_ckpt_path [insert checkpoint path]
As explained in Section 2.2, you can switch to different environments, layouts, and styles using the command line arguments. The script will run LeLaN for navigation for a maximum of 500 steps and then switch to running the manipulation policy. Results will be saved into the log directory.
To run BC w/ Nav, make sure you have completed Section 2.1 of this readme. First, download the policy checkpoints:
python mobipi/scripts/download_pi.py --nav
Then, run evaluation for this method by navigating into mobipi/eval/ and running the following script:
python eval_baseline.py --env_name CloseSingleDoor --layout_id 1 --style_id 1 --seed 1 \
--baseline_name il_nav --horizon 800
Similar to running the LeLaN baseline, you can switch to any other environment, layout, and style with the command line arguments. The script will run the BC w/ Nav baseline with an episode horizon of 800. This horizon value is set to be larger than the length of all training demos to ensure there is enough time for task completion.
To collect evaluation statistics, navigate into mobipi/eval/ and run python compute_success_rate.py [your log root directory]/[task name]/[method name]/bc_xfmr. The script will compute and print out the mean and standard deviation of success rates for the specified environment and method.
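For example, assuming your log root directory is ~/mobipi_logs and the method's log folder matches its baseline name (e.g., lelan), the call looks like:
cd mobipi/eval
python compute_success_rate.py ~/mobipi_logs/CloseSingleDoor/lelan/bc_xfmr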
In the paper, we introduced metrics that quantify the difficulty of mobilizing a given policy. Here, we show how to compute the spatial mobilization feasibility metric.
To compute the spatial metric, we first need to know the success rate of the given manipulation policy at different base pose deviations. To obtain these statistics, walk through Section 2.3 to run policy evaluation for your desired task across all evaluation scenes with --randomize_base_init_pose set to the following values: 0.0, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30. Then, walk through Section 2.7 to collect evaluation results for all of these settings. Finally, navigate into mobipi/vis/, fill in this dictionary with your collected mean success rate values, and run the following script:
python mobipi/vis/plot_spatial_metric.py
We also provide visualization tools to inspect scenes and method performances.
- Visualize Topdown Maps of a Scene: Run python mobipi/vis/plot_topdown.py.
- Visualize Navigation Targets Selected by Baselines and Our Method: Edit this line, this line, and this line to adjust environment and method names. Then, use mobipi/vis/plot_nav_poses.py to produce a plot.
- Visualize an Animation of the Optimization Process in Blender: First, make sure you have completed at least one evaluation episode using our method. Then, navigate to mobipi/vis/blender and follow the instructions at the top of replay_episode.py to run it. This should create a subdirectory called success in your replay output directory. In this success subdirectory, you will see folders named replay_*/. Note down the path to one of these folders and use it to run render_episode.py, following the instructions at the top of that file.
Interested in producing a result plot similar to ours? Check out mobipi/vis/plot_sim_table.py and mobipi/vis/plot_real_table.py.
This codebase is licensed under the terms of the MIT License.
- The simulation code is based on RoboCasa, RoboMimic, and MimicGen.
- Code from LeLaN is used in baseline implementations.
- Neural rendering implementation is based on NerfStudio.