Stars
Automatic Video Generation from Scientific Papers
[NeurIPS 2025 spotlight] Official implementation for "FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving"
Wan: Open and Advanced Large-Scale Video Generative Models
Reference PyTorch implementation and models for DINOv3
⏰ Collaboratively track worldwide conference deadlines (Website, Python CLI, WeChat Applet) / If you find it useful, please star this project, thanks~
[CVPR 2025] UniVAD: A Training-free Unified Model for Few-shot Visual Anomaly Detection
The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.
RT-DATR: Enhancing Real-time Unsupervised Domain Adaptive Detection Transformer for Autonomous Driving
[CVPR 2025 Oral] OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Paddle Multimodal Integration and eXploration, supporting mainstream multi-modal tasks, including end-to-end large-scale multi-modal pretrain models and diffusion model toolbox. Equipped with high …
Official implementation of the WACV 2025 (Oral) paper RT-DETRv3: Real-time End-to-End Object Detection with Hierarchical Dense Positive Supervision.
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
Official implementation of the NeurIPS 2023 paper MonoUNI: A Unified Vehicle and Infrastructure-side Monocular 3D Object Detection Network with Sufficient Depth Clues.
Official implementation of the 3DV 2024 paper MonoLSS: Learnable Sample Selection For Monocular 3D Detection
Official implementation of the CVPR paper Open-TransMind: A New Baseline and Benchmark for 1st Foundation Model Challenge of Intelligent Transportation
Official implementation of the CVPR 2024 paper ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions.
Official PyTorch implementation for a conditional diffusion probability model in BEV perception