Stars
Official repo for paper "SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass"
Official code for the paper "LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes".
Refine high-quality datasets and visual AI models
Object detection and tracking algorithm implemented for Real-Time video streams and static images.
Pedestrian detection and tracking with opencv + yolov8 + deepsort, plus an optional WebUI (based on gradio)
ACM Multimedia 2020 University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization 🚁 Annotates 1652 buildings at 72 universities around the world.
Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research
Demonstration of running a native LLM on Android device.
Object detection, 3D detection, and pose estimation using center point detection.
Detector for rotated objects based on CenterNet
The official implementation of the crowd counting model CLIP-EBC.
microsoft / vscode-python
Forked from DonJayamanne/pythonVSCode
Python extension for Visual Studio Code
[PVLDB 2024 Best Paper Nomination] TFB: Towards Comprehensive and Fair Benchmarking of Time Series Forecasting Methods
Real-time one-stage multi-class & multi-object tracking based on anchor-free detection and ReID
[CVPR 2025] Multiple Object Tracking as ID Prediction
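AISystem mainly covers AI systems, including full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks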
Context-Aware Multi-View Summarization Network for Image-Text Matching. ACM MM'20
This project supports knowledge distillation training for the pruned YOLOv5 model
Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities
Official code for "EagerMOT: 3D Multi-Object Tracking via Sensor Fusion" [ICRA 2021]
[ICCV 2023] Tracking Anything with Decoupled Video Segmentation
(IJCV2024 & ICCV2023) LSKNet: A Foundation Lightweight Backbone for Remote Sensing
Official Implementation of CVPR24 highlight paper: Matching Anything by Segmenting Anything
[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.