- THU
- Beijing
- https://operator22th.github.io/
- @Shaofeng_Yin
- https://scholar.google.com/citations?user=lpKyrxAAAAAJ&hl=en
Stars
Unofficial implementation of the Dreamer 4 world model in PyTorch.
Code, data and weights for the paper **What drives success in physical planning with Joint-Embedding Predictive World Models?**
Create motion for any robot by editing keyframes
Spirit-v1.5: A Robotic Foundation Model by Spirit AI
Cosmos-Predict2.5, the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world in the form of video.
🤖 Places where you can learn robotics (and stuff like that) online 🤖
Retargeting of whole-body human motion to humanoid robots for dexterous manipulation of articulated objects.
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
Team Comet's 2025 BEHAVIOR Challenge Codebase
A general physics-based retargeting framework.
A Survey on Reinforcement Learning of Vision-Language-Action Models for Robotic Manipulation
An optimized PyTorch framework for behavior cloning with flow-related generative models.
Repository for our papers: Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics and Uncertainty-Aware Robotic World Model Makes Offline Model-Based Reinforceme…
Agility Meets Stability: Versatile Humanoid Control with Heterogeneous Data
[CVPR 2025] InterAct: Advancing Large-Scale Versatile 3D Human-Object Interaction Generation
UMI on Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-body Controllers
We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that Sora-2 surpasses GPT5 by 10% on eyeballing puzzles and reache…
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
The repository provides code for running inference with the SAM 3D Body Model (3DB), links for downloading the trained model checkpoints and datasets, and example notebooks that show how to use the…
Pandora: Towards General World Model with Natural Language Actions and Video States
A paper list for spatial reasoning