pzhren/Surfer

Surfer: A World Model-Based Framework for Vision-Language Robot Manipulation

[arXiv] [Website]

SeaWave

SeaWave is a benchmark for robotic manipulation with progressive reasoning tasks, built on a realistic robotic manipulation simulator. Specifically, SeaWave provides a new high-fidelity digital twin scene built with Unreal Engine 5 and includes 40K natural language instructions generated by ChatGPT for detailed evaluation of robot manipulation.

Simulator

Pipeline

[Figure: pipeline]

Environment

Resource Consumption

Our experiments used a single NVIDIA GeForce RTX 3090 GPU. The simulator occupies approximately 2 to 3 GB of GPU memory.
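Before launching the simulator, it can help to confirm that a GPU has enough free memory for the footprint quoted above. The following is a minimal sketch, not part of this repository: the helper names (`parse_free_mib`, `gpus_with_headroom`, `query_nvidia_smi`) and the 3 GB threshold are assumptions chosen for illustration, and the only external tool it relies on is the standard `nvidia-smi` query interface.

```python
import subprocess

# Assumption: use the upper end of the 2 to 3 GB figure quoted above, in MiB.
SIMULATOR_MIB = 3 * 1024


def parse_free_mib(smi_output):
    """Parse one free-memory value (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line) for line in smi_output.strip().splitlines() if line.strip()]


def gpus_with_headroom(smi_output, needed_mib=SIMULATOR_MIB):
    """Return indices of GPUs whose free memory covers the simulator footprint."""
    free = parse_free_mib(smi_output)
    return [index for index, mib in enumerate(free) if mib >= needed_mib]


def query_nvidia_smi():
    """Ask nvidia-smi for free memory per GPU (requires the NVIDIA driver)."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.free",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On a machine with the NVIDIA driver installed, `gpus_with_headroom(query_nvidia_smi())` lists the GPU indices with enough free memory for the simulator alone; training itself will need additional memory on top of that.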

Simulator

See simulator details.

Training

python src/main.py

Test

python src/eval.py

Citation

@ARTICLE{ren2025surfer,
      author={Ren, Pengzhen and Zhang, Kaidong and Zheng, Hetao and Li, Zixuan and Wen, Yuhang and Zhu, Fengda and Ma, Shikui and Liang, Xiaodan},
      journal={IEEE Transactions on Neural Networks and Learning Systems}, 
      title={Surfer: A World Model-Based Framework for Vision-Language Robot Manipulation}, 
      year={2025},
      volume={},
      number={},
      pages={1-13},
      doi={10.1109/TNNLS.2025.3594117}
}

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.
