Collage_and_GraspRotateToolUse_1080p_compressed.mp4
This repository contains the official implementation of the SimToolReal framework, which was introduced in SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation. It consists of:
-
Simulation Environments: Isaac Gym environments for training and evaluation of dexterous tool manipulation policies.
-
DexToolBench: A benchmark for dexterous tool manipulation.
-
Reinforcement Learning (RL) Training: RL training algorithms for training dexterous tool manipulation policies.
-
Deployment: Policy deployment in simulation and the real world.
simtoolreal
├── assets
│ └── // Assets such as robot URDF files, object models, etc.
├── baselines
│ └── // Implementation of kinematic retargeting and fixed grasp
├── deployment
│ └── // Sim-to-real and sim-to-sim deployment of the policy
├── dextoolbench
│ ├── data
│ │ └── // DexToolBench data (needs to be downloaded)
│ ├── // Scripts for evaluating policies on DexToolBench
│ └── // Scripts for visualizing DexToolBench objects and trajectories
├── docs
│ └── // Documentation
├── isaacgymenvs
│ └── // Simulation environment for training and evaluating policies
├── pretrained_policy
│ └── // Checkpoint of the pretrained policy (needs to be downloaded)
├── recorded_data
│ └── // Interface and tools for saving, loading, and visualizing recorded data
└── rl_games
└── // RL algorithms, including PPO and SAPG
Please see the Installation documentation for more details.
Please run all commands from the root directory of this repository.For most commands, you can add --help to see the available options.
First, download the pretrained policy to pretrained_policy/.
python download_pretrained_policy.py
This will result in the following directory structure:
pretrained_policy/
├── config.yaml // Configuration file for the policy
└── model.pth // Checkpoint of the policy
Then, run the interactive evaluation script with the pretrained policy on a specific task in the DexToolBench.
python dextoolbench/eval.py \
--object_category hammer \
--object_name claw_hammer \
--task_name swing_down \
--config_path pretrained_policy/config.yaml \
--checkpoint_path pretrained_policy/model.pth \
--interactive
Interactive_Eval_cropped.mp4
The following is the full DexToolBench data structure:
# ── Full DexToolBench data structure ──────────────────────────────────────────
# {object_category: {object_name: [task_name, ...]}}
DEXTOOLBENCH_DATA_STRUCTURE: Dict[str, Dict[str, List[str]]] = {
"hammer": {
"claw_hammer": ["swing_down", "swing_side"],
"mallet_hammer": ["swing_down", "swing_side"],
},
"marker": {
"sharpie_marker": ["draw_smile", "write_c"],
"staples_marker": ["draw_smile", "write_c"],
},
"eraser": {
"flat_eraser": ["wipe_smile", "wipe_c"],
"handle_eraser": ["wipe_smile", "wipe_c"],
},
"brush": {
"blue_brush": ["sweep_forward", "sweep_right"],
"red_brush": ["sweep_forward", "sweep_right"],
},
"spatula": {
"flat_spatula": ["serve_plate", "flip_over"],
"spoon_spatula": ["serve_plate", "flip_over"],
},
"screwdriver": {
"long_screwdriver": ["spin_vertical", "spin_horizontal"],
"short_screwdriver": ["spin_vertical", "spin_horizontal"],
},
}
See dextoolbench/objects.py and assets/urdf/dextoolbench/<object_category>/<object_name>/<object_name>.urdf for more details about the objects.
See dextoolbench/trajectories for the list of task names following the directory structure dextoolbench/trajectories/<object_category>/<object_name>/<task_name>.json, which is the output of dextoolbench/process_poses.py. These .json files are poses specified in world frame.
Training logs are tracked with Weights & Biases. Before training, log in and update the wandb_entity in isaacgymenvs/launch_training.py to your own WandB entity:
wandb login
To train a policy from scratch, run the following command:
python isaacgymenvs/launch_training.py \
--custom_experiment_name my_experiment
To finetune a trained policy, run the following command:
python isaacgymenvs/launch_training.py \
--custom_experiment_name my_finetuning_experiment \
--checkpoint <checkpoint_path>
For example:
python isaacgymenvs/launch_training.py \
--custom_experiment_name my_finetuning_experiment \
--checkpoint pretrained_policy/model.pth
If you run out of GPU memory, you can reduce the number of environments by setting --num_envs to a smaller number. Note that num_envs must be divisible by num_blocks (default 6).
python isaacgymenvs/launch_training.py \
--custom_experiment_name my_finetuning_experiment_12288 \
--checkpoint pretrained_policy/model.pth \
--num_envs 12288
To list all available options, run:
python download_dextoolbench_data.py --list
To download the data for a specific task, run:
python download_dextoolbench_data.py \
--object_category hammer \
--object_name claw_hammer \
--task_name swing_down
To download the data for a specific object, run:
python download_dextoolbench_data.py \
--object_category hammer \
--object_name claw_hammer
To download the data for a specific category, run:
python download_dextoolbench_data.py \
--object_category hammer
To download all data, run:
python download_dextoolbench_data.py
For each task, it will download the data into the dextoolbench/data/<object_category>/<object_name>/<task_name>/ directory with the following structure:
dextoolbench/data/<object_category>/<object_name>/<task_name>/
├── cam_K.txt // Camera intrinsics
├── depth // Depth images
├── masks // Object masks
├── poses.json // Object poses in robot frame
└── rgb // RGB images
To visualize 1 demo:
python dextoolbench/visualize_demo.py \
--object_category hammer \
--object_name claw_hammer \
--task_name swing_down
VisualizeDemo_github_3.mp4
See dextoolbench/objects.py for the list of object models.
To visualize a DexToolBench object:
python dextoolbench/visualize_object.py \
--urdf_path assets/urdf/dextoolbench/hammer/claw_hammer/claw_hammer.urdf
To visualize all DexToolBench objects:
python dextoolbench/visualize_all_objects.py
To visualize training objects:
python dextoolbench/generate_training_objects.py
python dextoolbench/visualize_training_objects.py
To visualize a DexToolBench task trajectory:
python dextoolbench/visualize_task.py \
--object_category hammer \
--object_name claw_hammer \
--task_name swing_down
To visualize all DexToolBench task trajectories:
python dextoolbench/visualize_all_tasks.py
Visualize_All_Tasks_cropped.mp4
To manually create a task trajectory:
python dextoolbench/interactive_create_task_trajectory.py \
--object_category hammer \
--object_name claw_hammer \
--task_name my_new_task
To numerically evaluate a trained policy on DexToolBench:
python dextoolbench/run_all_evals.py
Use this to manually adjust the position and orientation of the object's origin frame, as well as the object's scale.
python dextoolbench/interactive_adjust_object.py \
--mesh_path assets/urdf/dextoolbench/hammer/claw_hammer/claw_hammer.obj \
--output_dir assets/urdf/dextoolbench/hammer/new_claw_hammer
See data_collection_and_processing.md for more details.
For Sim2Real policy deployment, we need to run the following nodes:
- RL Policy Node: Takes in observations, runs policy to get raw actions, converts to joint position targets, and publishes these targets.
- Goal Pose Node: Stores a sequence of goal poses, takes in object pose, updates current goal pose if dist(goal, object) < threshold, and publishes the current goal pose.
- Perception Node: Takes in RGB-D images, uses SAM and FoundationPose to get object pose, and publishes these poses.
- Robot Node: Sends joint position targets to robot and publishes joint states.
(1) and (2) are in this repo, but (3) and (4) are not in this repo.
The following is the Sim2Real deployment flowchart:
flowchart TD
subgraph Perception ["Perception System"]
Cam["RGB-D Camera"]
PN["Perception Node<br/>(SAM + FoundationPose)"]
end
subgraph Policy ["Control System"]
RL["RL Policy Node"]
GPN["Goal Pose Node"]
end
subgraph Hardware ["Physical Robot"]
RN["Robot Node<br/>(IIWA Arm + Sharpa Hand)"]
end
%% Perception Connections
Cam -- "RGB-D Images" --> PN
PN -- "/robot_frame/<br/>current_object_pose" --> RL
PN -- "/robot_frame/<br/>current_object_pose" --> GPN
%% Goal Pose Node Connections
GPN -- "/robot_frame/<br/>goal_object_pose" --> RL
%% Robot State Connections (Feedback)
RN -- "/iiwa/<br/>joint_states" --> RL
RN -- "/sharpa/<br/>joint_states" --> RL
%% Policy Command Connections (Actions)
RL -- "/iiwa/<br/>joint_cmd" --> RN
RL -- "/sharpa/<br/>joint_cmd" --> RN
%% Styling
classDef node fill:#f9f9f9,stroke:#333,stroke-width:2px;
classDef topic fill:#e1f5fe,stroke:#0288d1,stroke-width:1px;
Before testing the policy in the real world, we can test it in simulation using a similar setup to the real world. For Sim2Sim policy deployment, we need to run the following nodes:
- RL Policy Node: (same as above)
- Goal Pose Node: (same as above)
- Simulation Node: Takes in joint position targets, runs the simulation environment and publishes the simulation state (robot state and object pose). Replaces the robot node and perception node.
(1) and (2) are in this repo, but (3) and (4) are not in this repo.
The following is the Sim2Sim deployment flowchart (the Simulation Node at the top and bottom are the same node, but separated in the diagram for clarity/symmetry with the Sim2Real deployment flowchart):
flowchart TD
subgraph SimPerception ["Simulated Perception"]
SPN["Simulation Node<br/>(Simulates Perception)"]
end
subgraph Policy ["Control System"]
RL["RL Policy Node"]
GPN["Goal Pose Node"]
end
subgraph SimHardware ["Simulated Robot"]
SRN["Simulation Node<br/>(Simulates Robot)"]
end
%% Perception Connections
SPN -- "/robot_frame/<br/>current_object_pose" --> RL
SPN -- "/robot_frame/<br/>current_object_pose" --> GPN
%% Goal Pose Node Connections
GPN -- "/robot_frame/<br/>goal_object_pose" --> RL
%% Robot State Connections (Feedback)
SRN -- "/iiwa/<br/>joint_states" --> RL
SRN -- "/sharpa/<br/>joint_states" --> RL
%% Policy Command Connections (Actions)
RL -- "/iiwa/<br/>joint_cmd" --> SRN
RL -- "/sharpa/<br/>joint_cmd" --> SRN
%% Styling
classDef node fill:#f9f9f9,stroke:#333,stroke-width:2px;
classDef topic fill:#e1f5fe,stroke:#0288d1,stroke-width:1px;
We also use a Visualization Node that subscribes to relevant ROS topics and renders a 3D scene using Viser. This is very useful for debugging and visualization. This node only subscribes and does not publish any topics (read-only).
flowchart LR
subgraph Inputs ["Subscribed ROS Topics"]
direction TB
JS_I["/iiwa/joint_states"]
JS_S["/sharpa/joint_states"]
JC_I["/iiwa/joint_cmd"]
JC_S["/sharpa/joint_cmd"]
OP_C["/robot_frame/current_object_pose"]
OP_G["/robot_frame/goal_object_pose"]
end
VVN["Visualization Node"]
subgraph Output ["User Interface"]
GUI["Viser 3D Web Interface"]
end
%% Input Connections
JS_I --> VVN
JS_S --> VVN
JC_I --> VVN
JC_S --> VVN
OP_C --> VVN
OP_G --> VVN
%% Output Connection
VVN -- "Renders Scene" --> GUI
%% Styling
classDef node fill:#f9f9f9,stroke:#333,stroke-width:2px;
classDef topic fill:#e1f5fe,stroke:#0288d1,stroke-width:1px;
class JS_I,JS_S,JC_I,JC_S,OP_C,OP_G topic;
If running in real world, run the following nodes (not included in this repo):
# Arm
roslaunch iiwa_control joint_position_control.launch
# Hand
python deployment/sharpa_node.py
# Perception
python live_tracking_with_ros.py
If running in simulation, run the following:
python deployment/isaac/isaac_env_node.py \
--object_category hammer \
--object_name claw_hammer \
--task_name swing_down
To test this pipeline without running an actual physics simulation, you can replace the Simulation Node with (1) a fake robot node that simply interpolates to the joint position targets and (2) a fake perception node that simply publishes a fixed object pose:
python deployment/fake/fake_robot_node.py
python deployment/fake/fake_perception_node.py
First, start the Visualization Node:
python deployment/visualization_node.py \
--object_name claw_hammer
Next, home the robot:
python deployment/home_robot.py
To run the Goal Pose Node, run:
python deployment/goal_pose_node.py \
--object_category hammer \
--object_name claw_hammer \
--task_name swing_down
To run the RL Policy Node, run:
python deployment/rl_policy_node.py \
--policy_path pretrained_policy \
--object_name claw_hammer
To run an open-loop replay of a joint position trajectory:
python deployment/replay_trajectory.py \
--file_path <file_path>
For example:
python deployment/replay_trajectory.py \
--file_path recorded_robot_inputs/2026-02-17_testing/2026-02-17_02-33-12_model_arm0.1_claw_hammer.npz
See baselines.md for more details.
When rl_policy_node.py is running, it will record observation data and save this to a file upon exiting. You can visualize this data using the following script:
python recorded_data/visualize.py \
--file_path <file_path>
For example:
python recorded_data/visualize.py \
--file_path recorded_robot_inputs/2026-02-17_testing/2026-02-17_02-33-12_model_arm0.1_claw_hammer.npz
VisualizeRecordedData_cropped.mp4
Python files:
./format_pys.sh
URDF files:
./format_urdfs.sh
This implementation builds upon the following codebases:
@misc{kedia2026simtoolrealobjectcentricpolicyzeroshot,
title={SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation},
author={Kushal Kedia and Tyler Ga Wei Lum and Jeannette Bohg and C. Karen Liu},
year={2026},
eprint={2602.16863},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2602.16863}
}
If you have any questions, issues, or feedback, please contact Tyler Lum or Kushal Kedia.