Stars
[CVPR 2025] DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
PyTorch code and models for the DINOv2 self-supervised learning method.
[CVPR2023] LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion
[NeurIPS 2025] Official code of Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting
Official implementation of Anchor DETR.
⭐⭐⭐FightingCV Paper Reading, which helps you understand the most advanced research work in an easier way 🍀 🍀 🍀
[CVPR 2022 Oral] Official implementation of DN-DETR
[ECCV'24 & ICLR'25] CityGaussian Series for High-quality Large-Scale Scene Reconstruction with Gaussians
[AAAI2024] Far3D: Expanding the Horizon for Surround-view 3D Object Detection
An open source community implementation of the model from the paper: "Movie Gen: A Cast of Media Foundation Models". Join our community to help implement this model!
Object tracking metrics in JavaScript (MOTA, IDF1, ...)
A suite of image and video neural tokenizers
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
[ICCV 2025] Official implementation of the paper “MagicDrive-V2: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control”
[CVPR 2024] Official implementation of "Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction"
EGSRAL: An Enhanced 3D Gaussian Splatting based Renderer with Automated Labeling for Large-Scale Driving Scene
[CVPR'25] Tora: Trajectory-oriented Diffusion Transformer for Video Generation
Fine-Grained Open Domain Image Animation with Motion Guidance
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
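[Three Years of Interviews, Five Years of Mock Exams] An interview handbook for AIGC algorithm engineers. Covers practical interview and written-test experience and core knowledge across the AI industry, including AIGC, traditional deep learning, autonomous driving, AI Agent, machine learning, computer vision, natural language processing, reinforcement learning, embodied AI, the metaverse, and AGI.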
[WACV2025] Official PyTorch implementation of TrackDiffusion (https://arxiv.org/abs/2312.00651)
A diffuser implementation of Zero123. Zero-1-to-3: Zero-shot One Image to 3D Object (ICCV23)
[ICLR'25] UniDrive: Towards Universal Driving Perception Across Camera Configurations