Stars
Code for "DrivingWorld: Constructing World Model for Autonomous Driving via Video GPT"
RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
The best OSS video generation models, created by Genmo
Code for "Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation"
A comprehensive list of resources on Robot Manipulation, including papers, code, and related websites.
GIM: Learning Generalizable Image Matcher From Internet Videos (ICLR 2024 Spotlight)
A Survey of Embodied Learning for Object-Centric Robotic Manipulation
A 3DGS framework for omni urban scene reconstruction and simulation.
[CVPR 2023] Official repository for downloading, processing, visualizing, and training models on the ARCTIC dataset.
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
Text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement
[AAAI 2025] EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."
RaDe-GS: Rasterizing Depth in Gaussian Splatting
[SIGGRAPH Asia'24 & TOG] Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes
[AAAI 2025] Official implementation of "OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on"
🧙🏻‍♂️ A curated list of papers on Radiance Field-based 3D Editing.
[ECCV'24] GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
[CVPR 2024 Oral] Rethinking Inductive Biases for Surface Normal Estimation
[ICML 2024] Official code for GaussianPro: 3D Gaussian Splatting with Progressive Propagation