Stars
Awesome speech/audio LLMs, representation learning, and codec models
Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation
Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, and video, and of performing real-time speech generation.
Awesome-LLM: a curated list of Large Language Models
[IROS 2024] Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation. [CoRL 2024] OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning
Janus-Series: Unified Multimodal Understanding and Generation Models
Python inverse kinematics using Pinocchio and QP solvers
This repository implements teleoperation of the Unitree humanoid robot using XR devices.
A brief introduction to quaternions and their applications in 3D geometry.
A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator
Retargeting algorithm for humanoid robot arms, with a Vision Pro app
Various retargeting optimizers to translate human hand motion to robot hand motion.
Awesome work on hand pose estimation/tracking
Adversarial skill embeddings for training reusable controllers for physically simulated characters.
Official Implementation of the ICCV 2023 paper: Perpetual Humanoid Control for Real-time Simulated Avatars
visionOS app + Python library to stream hand tracking data from Vision Pro and to stream video/audio to Vision Pro.
[RSS 2024] 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations
A simulation platform for versatile Embodied AI research and development.