Stars
[CVPR 2026] InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields
Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
[CVPR 2025 Highlight] DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Universal Monocular Metric Depth Estimation
Towards Scalable Pre-training of Visual Tokenizers for Generation
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
[CVPR 2025] RollingDepth: Video Depth without Video Models
[CVPR 2025 Highlight] Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
[ICLR 2026] Deforming Videos to Masks: Flow Matching for Referring Video Segmentation (FlowRVS)
🔥 🔥 🔥 A paper list of some recent Computer Vision (CV) works
A Next.js web application that integrates AI capabilities with draw.io diagrams. This app allows you to create, modify, and enhance diagrams through natural language commands and AI-assisted visual…
[ECCV 2018] CCPD: a diverse and well-annotated dataset for license plate detection and recognition
Library to read/write .npy and .npz files in C/C++
CUDA accelerated rasterization of gaussian splatting
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
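🚁🚀 2026 proxy service ("airport") recommendations | 熊猫VPN (PandaVPNPro) has been confirmed to have shut down and disappeared! 快连VPN and 小牛加速器 (小牛VPN) offer a poor experience at a high price. Low-price, cheap, affordable, good-value, high-speed, stable, one-yuan, firewall-circumvention, paid, fee-based, niche, high-quality, and budget proxy services. Proxy service reviews, bypassing the firewall in China, "scientific internet access", ladders (proxies). Not a permanently free ladder, official website, not a permanently free VPN, not a free proxy service! VPN China. Usage-billed proxy services, pay-as-you-go…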
The code for PixelRefer & VideoRefer
Effortless data labeling with AI support from Segment Anything and other awesome models.
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Stable Diffusion with Core ML on Apple Silicon
[ACMMM 2025] This repo is the official implementation of "Learning Partially-Decorrelated Common Spaces for Ad-hoc Video Search"
A comprehensive survey of Composed Multi-modal Retrieval (CMR), including Composed Image Retrieval (CIR) and Composed Video Retrieval (CVR), etc.