Stars
IMG: Calibrating Diffusion Models via Implicit Multimodal Guidance, ICCV 2025
Official Repository of Absolute Zero Reasoner
Repo for the paper https://arxiv.org/abs/2504.13837
[CVPR 2025] Official repo for ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation
[ICML 2024] SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning
[TPAMI 2024] Probabilistic Contrastive Learning for Long-Tailed Visual Recognition
[ECCV 2024] Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
[NeurIPS 2022] Latency-aware Spatial-wise Dynamic Networks
[IEEE TPAMI] Latency-aware Unified Dynamic Networks for Efficient Image Recognition
[ECCV 2022] Learning to Weight Samples for Dynamic Early-exiting Networks
[IEEE TIP] Fine-grained Recognition with Learnable Semantic Data Augmentation
[ICCV 2023] Adaptive Rotated Convolution for Rotated Object Detection
This repository is for the first comprehensive survey on Meta AI's Segment Anything Model (SAM).
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models (arXiv 2023 / CVPR 2024)
[NeurIPS 2023] Rank-DETR for High Quality Object Detection
A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"
This repo curates prompts for using ChatGPT and other LLM tools more effectively.
Repository of Vision Transformer with Deformable Attention (CVPR 2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.
Code release for Deep Incubation (https://arxiv.org/abs/2212.04129)
[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
[Pattern Recognition 2025] Cross-Modal Adapter for Vision-Language Retrieval