Paper | Project Page | Model | WeChat Contact
- [2025.8.12] Updated RoboTwin2 inference code
H-RDT (Human to Robotics Diffusion Transformer) is a novel approach that leverages large-scale egocentric human manipulation data to enhance robot manipulation capabilities. Our key insight is that large-scale egocentric human manipulation videos with paired 3D hand pose annotations provide rich behavioral priors that capture natural manipulation strategies and can benefit robotic policy learning.
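To make the data setting concrete, paired egocentric data of this kind can be thought of as synchronized video frames plus per-frame 3D hand poses. The sketch below is illustrative only — the array names, the 21-joint hand layout, and the clip sizes are assumptions, not H-RDT's actual schema:

```python
import numpy as np

# Hypothetical layout for one egocentric clip: T frames of RGB video
# paired with per-frame 3D keypoints for both hands.
# All names and sizes here are illustrative assumptions, not H-RDT's schema.
T, H, W = 8, 224, 224      # clip length and frame size (assumed)
N_JOINTS = 21              # common per-hand keypoint count (assumed)

frames = np.zeros((T, H, W, 3), dtype=np.uint8)               # egocentric video
hand_poses = np.zeros((T, 2, N_JOINTS, 3), dtype=np.float32)  # (frame, hand, joint, xyz)

# A policy-learning pipeline can treat pose deltas between frames as a
# behavioral prior: how human hands actually move during manipulation.
deltas = np.diff(hand_poses, axis=0)
print(deltas.shape)  # → (7, 2, 21, 3)
```

The key point is that the hand-pose stream, not the pixels alone, is what carries the manipulation prior that transfers to robot policies.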
- Create conda environment:

  ```bash
  conda create -n hrdt python=3.10
  conda activate hrdt
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Download pre-trained models:

  ```bash
  export HF_ENDPOINT=https://hf-mirror.com
  huggingface-cli download --resume-download embodiedfoundation/H-RDT --local-dir ./
  ```
Before training, preprocess the EgoDex dataset:
- Configure paths:

  ```bash
  # Edit datasets/pretrain/setup_pretrain.sh with your paths
  nano datasets/pretrain/setup_pretrain.sh
  # Set your EgoDex dataset and T5 model paths:
  export EGODEX_DATA_ROOT="/path/to/your/egodx/dataset"
  export T5_MODEL_PATH="/path/to/your/t5-v1_1-xxl"
  ```

- Setup environment:

  ```bash
  source datasets/pretrain/setup_pretrain.sh
  ```

- Run data processing pipeline:

  ```bash
  # Automatically runs: precompute_48d_actions.py → calc_stat.py → encode_lang_batch.py
  ./datasets/pretrain/run_pretrain_pipeline.sh
  ```
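As a hedged sketch of what a statistics step like calc_stat.py typically does (the script's actual logic may differ), the pipeline needs per-dimension mean/std over the precomputed 48-dim actions so they can be normalized for training and denormalized at deployment. Everything below except the 48-dim action size is an assumption for illustration:

```python
import numpy as np

# Illustrative only: compute per-dimension normalization statistics over a
# stack of action vectors, as a calc_stat.py-style step typically would.
# The 48-dim size follows precompute_48d_actions.py's name; the rest is an
# assumption about the pipeline, not its actual code.
rng = np.random.default_rng(0)
actions = rng.normal(loc=2.0, scale=3.0, size=(1000, 48))  # fake action dataset

stats = {
    "mean": actions.mean(axis=0),        # (48,)
    "std": actions.std(axis=0) + 1e-8,   # epsilon avoids division by zero
}

# Training-time normalization and its inverse for deployment:
normed = (actions - stats["mean"]) / stats["std"]
restored = normed * stats["std"] + stats["mean"]
```

Saving `stats` alongside the dataset is what lets the fine-tuned policy map its normalized outputs back to real action values.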
After data preprocessing is complete:

1. EgoDex Pretrain (fresh start):

   - Configure dataset:

     ```python
     # Edit datasets/dataset.py line ~45
     self.dataset_name = "egodx"
     ```

   - Run training:

     ```bash
     bash pretrain.sh
     ```

2. Pretrain Resume:

   Edit pretrain.sh and add this line:

   ```bash
   --resume_from_checkpoint="checkpoint-450000" \
   ```

Pre-computed language embeddings are already provided - no preprocessing needed!
- Setup environment:

  ```bash
  # Edit datasets/robotwin2/setup_robotwin2.sh if needed (only for regenerating files)
  source datasets/robotwin2/setup_robotwin2.sh
  ```

- Data processing pipeline (not required):

  ```bash
  # Not needed - lang_embeddings/ already provided in repository
  # Only run if you want to regenerate files:
  # ./datasets/robotwin2/run_robotwin2_pipeline.sh
  ```

- Configure dataset:

  ```python
  # Edit datasets/dataset.py line ~45
  self.dataset_name = "robotwin_agilex"  # or your robot name

  # Add your dataset initialization if not exists:
  elif self.dataset_name == "your_robot":
      self.hdf5_dataset = YourRobotDataset(config=config)
  ```

- Run training:

  ```bash
  bash finetune.sh  # Already configured with pretrained_backbone_path
  ```
Edit your current finetune script and make these changes:
```bash
# Change this line:
--mode="finetune" \
# To:
--mode="pretrain" \
# And add:
--resume_from_checkpoint="checkpoint-5000" \
```

| Training Scenario | Base Script | Required Shell Script Modifications | Mode & Key Parameters |
|---|---|---|---|
| Human Pretrain (Fresh) | `pretrain.sh` | `--mode="pretrain"` | Start pretraining on EgoDex human data |
| Human Pretrain Resume | `pretrain.sh` | Add: `--resume_from_checkpoint="checkpoint-450000" \` | `--mode="pretrain"` |
| Robot Fine-tuning | `finetune.sh` | Change: `--mode="finetune" \`<br>Add: `--pretrained_backbone_path="./checkpoints/pretrain-0618/checkpoint-500000/pytorch_model.bin" \`<br>Change: `--config_path="configs/hrdt_finetune.yaml" \` | Load human pre-trained backbone, fresh action layers |
| Robot Finetune Resume | Your finetune script | Change: `--mode="finetune"` → `--mode="pretrain"`<br>Add: `--resume_from_checkpoint="checkpoint-5000" \` | Continue robot fine-tuning |
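To make the "load human pre-trained backbone, fresh action layers" scenario concrete: fine-tuning typically reuses only the backbone weights from the pretrain checkpoint, while robot-specific action layers start from scratch because the robot's action space differs from the 48-dim human one. The parameter names below are invented for illustration; the repository's real state dict and loading code will differ:

```python
# Sketch: partition a checkpoint's parameters into backbone weights that are
# reused and action-head weights that are re-initialized for the robot.
# All key names here are hypothetical, not H-RDT's real state dict.
checkpoint = {
    "backbone.blocks.0.attn.weight": "...",
    "backbone.blocks.0.mlp.weight": "...",
    "action_head.proj_in.weight": "...",   # sized for 48-dim human actions
    "action_head.proj_out.weight": "...",
}

# Keep only backbone parameters; action layers start fresh because the
# robot's action dimension differs from the human hand representation.
backbone_state = {k: v for k, v in checkpoint.items() if k.startswith("backbone.")}
fresh_keys = [k for k in checkpoint if k not in backbone_state]
print(sorted(fresh_keys))  # → ['action_head.proj_in.weight', 'action_head.proj_out.weight']
```

In a framework like PyTorch this filtered dict would then be loaded non-strictly, leaving the unmatched action-layer parameters at their fresh initialization.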
Before training, you need to configure the dataset in `datasets/dataset.py`:
For human pretraining:

```python
# In datasets/dataset.py, line ~45
self.dataset_name = "egodx"
# The EgoDexDataset will be automatically initialized
```

For robot fine-tuning:

```python
# In datasets/dataset.py, line ~45
self.dataset_name = "your_robot_name"  # e.g., "robotwin_agilex"

# Add your dataset to the initialization logic:
elif self.dataset_name == "your_robot_name":
    self.hdf5_dataset = YourRobotDataset(
        config=config,
        # your dataset parameters
    )
```

To add a new robot dataset:

- Create your dataset folder: `datasets/your_robot/`
- Implement your dataset class (see `datasets/robotwin2/` as an example)
- Create data processing scripts (see `datasets/pretrain/` or `datasets/robotwin2/` as examples)
- Import it in `datasets/dataset.py`
- Add initialization logic in `VLAConsumerDataset.__init__`
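The steps above can be sketched as a minimal dataset class. The interface (a `config` argument, `__len__`/`__getitem__` returning state and action arrays) mirrors the pattern described here, but every name and shape below is a stand-in, not the repository's actual base class or signature:

```python
import numpy as np

class YourRobotDataset:
    """Illustrative stand-in for a custom HDF5-backed dataset class.

    A real implementation would read HDF5 episode files (as the
    datasets/robotwin2/ example presumably does); here two episodes are
    faked in memory so the access pattern is runnable.
    """

    def __init__(self, config=None):
        self.config = config or {}
        # Fake episodes: each holds per-step robot states and actions.
        self.episodes = [
            {"state": np.zeros((50, 14)), "action": np.zeros((50, 14))},
            {"state": np.zeros((80, 14)), "action": np.zeros((80, 14))},
        ]

    def __len__(self):
        return len(self.episodes)

    def __getitem__(self, idx):
        ep = self.episodes[idx]
        return {"state": ep["state"], "action": ep["action"]}

ds = YourRobotDataset(config={"robot": "your_robot_name"})
sample = ds[1]
```

Once a class with this shape exists and is registered in `VLAConsumerDataset.__init__`, the rest of the training pipeline can consume it like the built-in datasets.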
- `configs/hrdt_pretrain.yaml`: Human pre-training configuration
- `configs/hrdt_finetune.yaml`: Robot fine-tuning configuration
- `datasets/dataset.py`: Dataset selection and initialization
- Modify `state_dim`, `action_dim`, `output_size` for your robot
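As a hedged illustration of what adapting those fields might look like (the actual key names and nesting in `configs/hrdt_finetune.yaml` may differ — check the file itself):

```yaml
# Illustrative fragment only — consult configs/hrdt_finetune.yaml for the
# real key names and values used by the repository.
model:
  state_dim: 14      # e.g., dual-arm joints + grippers (assumed example)
  action_dim: 14     # robot action size; replaces the 48-dim human actions
  output_size: 14    # action head output width, matched to action_dim
```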
Join our WeChat group to discuss H-RDT related technical issues:
For other questions or collaboration opportunities, please add the personal WeChat below:
Note: If the QR code expires, please contact us through project Issues for the latest contact information.