Stars
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
Official EHM Tracking Implementation for GUAVA (ICCV 2025)
Official implementation of the paper "GUAVA: Generalizable Upper Body 3D Gaussian Avatar" [ICCV 2025]
[ICLR 2024] Official implementation of "TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting"
[RSS 2025] AMO: Adaptive Motion Optimization for Hyper-Dexterous Humanoid Whole-Body Control
Official implementation of OpenWBT.
VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
Low-level locomotion policy training in Isaac Lab
[RSS'25] This repository is the implementation of "NaVILA: Legged Robot Vision-Language-Action Model for Navigation"
Official Implementation of "KungfuBot: Physics-Based Humanoid Whole-Body Control for Learning Highly-Dynamic Skills"
[IROS 2025] Generalizable Humanoid Manipulation with 3D Diffusion Policies. Part 1: Train & Deploy of iDP3
Humanoid robot arms retarget algorithm with VisionPro app
Various retargeting optimizers to translate human hand motion to robot hand motion.
PyTorch implementation for our paper Learning Character-Agnostic Motion for Motion Retargeting in 2D, SIGGRAPH 2019
[RSS 2024] Expressive Whole-Body Control for Humanoid Robots
FantasyPortrait: Enhancing Multi-Character Portrait Animation with Expression-Augmented Diffusion Transformers
Text-audio foundation model from Boson AI
LeetCode Solutions: A Record of My Problem-Solving Journey.
Repository hosting code for "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).
[ACM MM 2024] "DiffMM: Multi-Modal Diffusion Model for Recommendation"
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models