- Shanghai Jiao Tong University
- Shanghai, China
- https://zlicheng.com/
- @Colmar_zlc
Stars
Write PyTorch controllers, test them in simulation, and seamlessly transfer to real-time hardware.
Octo is a transformer-based robot policy trained on a diverse mix of 800k robot trajectories.
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model
🔥 SpatialVLA: a spatial-enhanced vision-language-action model that is trained on 1.1 Million real robot episodes. Accepted at RSS 2025.
Paper list in the survey: A Survey on Vision-Language-Action Models: An Action Tokenization Perspective
moojink / openvla-oft
Forked from openvla/openvla
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
Kronos: A Foundation Model for the Language of Financial Markets
Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments.
Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Google Robot, WidowX+Bridge) (CoRL 2024)
Benchmarking Knowledge Transfer in Lifelong Robot Learning
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
[TPAMI 2025] Towards Visual Grounding: A Survey
[ICCV 2025] DyWA: Dynamics-adaptive World Action Model for Generalizable Non-prehensile Manipulation
Unified framework for robot learning built on NVIDIA Isaac Sim
[IROS 2025 Award Finalist] The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
📹 A more flexible framework that can generate videos at any resolution and create videos from images.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
Official implementation of OpenWBT.
[ICCV'25 Best Paper Finalist] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video