A collection of utilities for LeRobot.
A Modular Toolkit for Robot Kinematic Optimization
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
Official code repository of paper "D(R, O) Grasp: A Unified Representation of Robot and Object Interaction for Cross-Embodiment Dexterous Grasping"
Universal Monocular Metric Depth Estimation
[NeurIPS 2025] CogVLA: Cognition-Aligned Vision-Language-Action Models via Instruction-Driven Routing & Sparsification
[CoRL 2025] ManiFlow: A General Robot Manipulation Policy via Consistency Flow Training
[IROS 2025] Generalizable Humanoid Manipulation with 3D Diffusion Policies. Part 1: Train & Deploy of iDP3
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
[arXiv 2025] GMR: General Motion Retargeting. Retarget human motions into diverse humanoid robots in real time on CPU. Retargeter for TWIST.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
A curated list of 3D Vision papers relating to Robotics domain in the era of large models i.e. LLMs/VLMs, inspired by awesome-computer-vision, including papers, codes, and related websites
AnyPos: Automated Task-Agnostic Actions for Bimanual Manipulation
[CVPR 2024 Highlight] Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation
[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.
RoboBrain 2.0: Advanced version of RoboBrain. See Better. Think Harder. Do Smarter.
Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots
[RSS 2025] Gripper Keypose and Object Pointflow as Interfaces for Bimanual Robotic Manipulation
[RSS 2025] Official implementation of DemoGen: Synthetic Demonstration Generation for Data-Efficient Visuomotor Policy Learning
DexWild: Dexterous Human Interactions for In-the-Wild Robot Policies
Official PyTorch Implementation of Unified Video Action Model (RSS 2025)
[CoRL 2024 Oral] D^3Fields: Dynamic 3D Descriptor Fields for Zero-Shot Generalizable Rearrangement
GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Metric depth estimation from a single image
[ICCV 2025] Detect Anything 3D in the Wild