- Huazhong University of Science and Technology
- Wuhan, China
Stars
[ICCV 2025] II-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting
Reference PyTorch implementation and models for DINOv3
Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image
[ICCV 2023] VAD: Vectorized Scene Representation for Efficient Autonomous Driving
[NeurIPS 2025] LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS
GASP: Unifying Geometric and Semantic Self-Supervised Pre-training for Autonomous Driving
Code for "Scaling Language-Free Visual Representation Learning" paper (Web-SSL).
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
[ICCV 2025] SAM4D: Segment Anything in Camera and LiDAR Streams
[ACMMM 2025] Official implementation of the paper "DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment"
An open-source code repository for driving world models, with training, inference, and evaluation tools, plus pretrained checkpoints.
[ICCV 2025] HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation
A generative world for general-purpose robotics & embodied AI learning.
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
This repo contains the code for 1D tokenizer and generator
[NeurIPS 2024] DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model
[WACV'25 Oral] Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
Refine high-quality datasets and visual AI models
[NeurIPS 2024] Cross-video Identity Correlating for Person Re-identification Pre-training
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
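A collection of AIGC-interview/CV-interview/LLMs-interview questions and answers, which also includes new ideas, new problems, new resources, and new projects from work and research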
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[ECCV'24] GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
Generative Models by Stability AI
[NeurIPS 2024] A Generalizable World Model for Autonomous Driving
[ECCV 2024] Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction
OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving
[ICLR 2024 Oral] Generative Gaussian Splatting for Efficient 3D Content Creation