Stars
A powerful tool for creating fine-tuning datasets for LLMs
MomentMix Augmentation with Length-Aware DETR for Temporally Robust Moment Retrieval
[CVPR 2025 Highlight] Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis
A collection of diffusion model papers categorized by their subareas
A collection of resources and papers on Diffusion Models
[IROS 2024] Learning from Spatio-temporal Correlation for Semi-Supervised LiDAR Semantic Segmentation
Official PyTorch implementation of “MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation”
PyTorch implementation of Twelve Labs' Video Foundation Model evaluation framework & open embeddings
[NeurIPS 2024] Understanding Multi-Granularity for Open-Vocabulary Part Segmentation
Model Stock: All we need is just a few fine-tuned models
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
Official code for the NeurIPS 2023 paper "Switching Temporary Teachers for Semi-Supervised Semantic Segmentation"
Official PyTorch implementation of "Improved Probabilistic Image-Text Representations" (ICLR 2024)
TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision
MoVQGAN: a model for image encoding and reconstruction
iBOT 🤖: Image BERT Pre-Training with Online Tokenizer (ICLR 2022)
Taming Transformers for High-Resolution Image Synthesis
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object Detection"
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
A re-implementation of SegFormer. See the official repo at https://github.com/NVlabs/SegFormer.
[ICLR 2023 Spotlight] GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation