Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model
video.mp4
ALL code details and guidance will be released in one month
We evluate X-VLA across 6 simulations, which encompass hundreds of evaluation setups, spanning single-arm, bi-manual robotic systems, autonomous driving, and assessing diverse axes of generalization, including cross-embodiment, cross-environment, and cross-task adaptation.
| Simpler | Libero | Calvin | RoboTwin_2.0 | VLABench | NAVSIM | |||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VM | VM | WidowX | Spatial | Object | Goal | Long | Avg | ABC→D | Easy | Hard | Avg. PS | PDMS |
| 80.4 | 75.7 | 95.8 | 98.2 | 98.6 | 97.8 | 97.6 | 98.1 | 4.43 | 70.0 | 39.0 | 51.1 | 87.3 |
More detailed metrics for each benchmark are available in the following figures. Click to view: Robotics Simulation and Autonomous Driving.
Following π₀, we adopt a server-client setup for simulation evaluations. Specifically, we run the policy and the simulation environment in separate Python processes, using a network-based server-client setup to enable communication between them. The policy acts as the server, while the simulation environment queries it as a client.
To start the server, run the following commands:
conda activate xvla
bash scripts/depoly.sh
Next, run the evaluation scripts to test the policy. To evaluate across different simulations, make sure to set up the corresponding environments: Libero, Simpler, Calvin, VLABench, and NAVSIM.
For example, to evaluate the policy on the Libero benchmark, you can run the following commands after installing Libero in a separate Conda environment:
conda activate libero
bash eval/libero/client.sh
To enable the use of the Abs EEF as the control interface on LIBERO, we replay the dataset to obtain the corresponding actions:
for action in actions:
obs, reward, done, info = env.step(action)
abs_pos = env.env.robots[0].controller.goal_pos
abs_ori = env.env.robots[0].controller.goal_oriFor evaluation, the controller needs to be set to absolute control mode:
env.reset()
for robot in env.env.robots:
robot.controller.use_delta=False