This repository contains the list of papers accepted for oral presentation at CVPR 2025

2025

OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels, CVPR 2025. [Paper | Code]
Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space, CVPR 2025. [Paper | Project | Code]
3D Student Splatting and Scooping, CVPR 2025. [Paper | Code]
CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models, CVPR 2025. [Paper | Project | Code]
Reconstructing Humans with a Biomechanically Accurate Skeleton, CVPR 2025. [Paper | Project | Code]
Multi-view Reconstruction via SfM-guided Monocular Depth Estimation, CVPR 2025. [Paper | Project | Code]
Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models, CVPR 2025. [Paper | Project | Code]
CustAny: Customizing Anything from A Single Example, CVPR 2025. [Paper | Project | Code]
VGGT: Visual Geometry Grounded Transformer, CVPR 2025. [Paper | Project | Code] 🏆🏆🏆
Navigation World Models, CVPR 2025. [Paper | Project | Code]
MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos, CVPR 2025. [Paper | Project | Code]
Stereo4D: Learning How Things Move in 3D from Internet Stereo Videos, CVPR 2025. [Paper | Project | Code]
FoundationStereo: Zero-Shot Stereo Matching, CVPR 2025. [Paper | Project | Code]
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models, CVPR 2025. [Paper | Code]
The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition, CVPR 2025. [Paper | Project]
Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis, CVPR 2025. [Paper | Project | Code]
Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing, CVPR 2025. [Paper | Project | Code]
MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds, CVPR 2025. [Paper | Project | Code]
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization, CVPR 2025. [Paper | Project | Code]
Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models, CVPR 2025. [Paper | Code]
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models, CVPR 2025. [Paper | Project | Code]
DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models, CVPR 2025. [Paper | Project | Code]
FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video, CVPR 2025. [Paper | Project | Code]
OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation, CVPR 2025. [Paper | Project | Code]
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders, CVPR 2025. [Paper | Project | Code]
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea, CVPR 2025. [Paper | Project | Code]
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection, CVPR 2025. [Paper | Code]
SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing, CVPR 2025. [Paper | Project | Code]
OPA-DPO: Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key, CVPR 2025. [Paper | Project | Code]
Minority-Focused Text-to-Image Generation via Prompt Optimization, CVPR 2025. [Paper | Code]
Autoregressive Distillation of Diffusion Transformers, CVPR 2025. [Paper | Code]
Adv-CPG: A Customized Portrait Generation Framework with Facial Adversarial Attacks, CVPR 2025. [Paper | Code]
TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion, CVPR 2025. [Paper | Code]
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models, CVPR 2025. [Paper | Blog | Code]
CUT3R: Continuous 3D Perception Model with Persistent State, CVPR 2025. [Paper | Project | Code]
Birth and Death of a Rose, CVPR 2025. [Paper | Project | Code]
Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content, CVPR 2025. [Paper | Code]
Rethinking Vision-Language Model in Face Forensics: Multi-Modal Interpretable Forged Face Detector, CVPR 2025. [Paper | Code]
Exploring CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation, CVPR 2025. [Paper | Code]
LoRASculpt: Sculpting LoRA for Harmonizing General and Specialized Knowledge in Multimodal Large Language Models, CVPR 2025. [Paper]
FedSPA: Generalizable Federated Graph Learning under Homophily Heterogeneity, CVPR 2025. [Paper]
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces, CVPR 2025. [Paper | Project | Code]
Video-XL Family: Efficient VLMs for Extremely Long Video Understanding, CVPR 2025. [Paper | Code]
Neural Inverse Rendering from Propagating Light, CVPR 2025. [Paper | Project | Code] 🏆🏆🏆
MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision, CVPR 2025. [Paper | Project | Code]
Motion Prompting: Controlling Video Generation with Motion Trajectories, CVPR 2025. [Paper | Project]
Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise, CVPR 2025. [Paper | Project | Code]
LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping, CVPR 2025. [Paper | Project]
LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions, CVPR 2025. [Paper | Code]
Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild, CVPR 2025. [Paper | Project]
CleanDIFT: Diffusion Features without Noise, CVPR 2025. [Paper | Project | Code]
Towards Explicit Geometry-Reflectance Collaboration for Generalized LiDAR Segmentation in Adverse Weather, CVPR 2025. [Paper]
DiffFNO: Diffusion Fourier Neural Operator, CVPR 2025. [Paper | Project]
Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining, CVPR 2025. [Paper]
CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner, CVPR 2025. [Paper | Project | Code]
Reanimating Images using Neural Representations of Dynamic Stimuli, CVPR 2025. [Paper]
EgoLM: Multi-Modal Language Model of Egocentric Motions, CVPR 2025. [Paper | Project]
MEGA: Masked Generative Autoencoder for Human Mesh Recovery, CVPR 2025. [Paper]
Descriptor-In-Pixel: Point-Feature Tracking for Pixel Processor Arrays, CVPR 2025. [Project]
Temporally Consistent Object-Centric Learning by Contrasting Slots, CVPR 2025. [Paper | Project | Code]
Temporal Alignment-Free Video Matching for Few-shot Action Recognition, CVPR 2025. [Paper | Project | Code]
One Category One Prompt: Dataset Distillation using Diffusion Models, CVPR 2025. [Paper | Code]
IceDiff: High Resolution and High-Quality Sea Ice Forecasting with Generative Diffusion Prior, CVPR 2025. [Paper]
Efficient Test-time Adaptive Object Detection via Sensitivity-Guided Pruning, CVPR 2025. [Paper]
Keep the Balance: A Parameter-Efficient Symmetrical Framework for RGB+X Semantic Segmentation, CVPR 2025. [Paper]
Identifying and Mitigating Position Bias of Multi-image Vision-Language Models, CVPR 2025. [Paper]
From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons, CVPR 2025. [Paper]
Language-Guided Image Tokenization for Generation, CVPR 2025. [Paper | Project]
DreamRelation: Bridging Customization and Relation Generation, CVPR 2025. [Paper | Project | Code]
GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skills, CVPR 2025. [Paper | Project | Code]
Viewpoint Rosetta Stone: Unlocking Unpaired Ego-Exo Videos for View-invariant Representation Learning, CVPR 2025. [Project]
DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution, CVPR 2025. [Paper | Code]
Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World, CVPR 2025. [Paper | Code]
Learned Binocular-Encoding Optics for RGBD Imaging Using Joint Stereo and Focus Cues, CVPR 2025. [Project]
Camera resection from known line pencils and a radially distorted scanline, CVPR 2025. [Code]
Opportunistic Single-Photon Time of Flight, CVPR 2025. [Paper]
DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models, CVPR 2025. [Paper]
UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming, CVPR 2025. [Paper]
Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning, CVPR 2025. [Paper | Code]
Enhancing Diversity for Data-free Quantization, CVPR 2025. [Paper]
TopoCellGen: Generating Histopathology Cell Topology with a Diffusion Model, CVPR 2025. [Paper | Code]
Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation, CVPR 2025. [Paper]
Time of the Flight of the Gaussians: Fast and Accurate Dynamic Time-of-Flight Radiance Field, CVPR 2025. [Paper | Project | Code]
Zero-Shot Monocular Scene Flow Estimation in the Wild, CVPR 2025. [Paper | Project]
DNF: Unconditional 4D Generation with Dictionary-based Neural Fields, CVPR 2025. [Paper | Project | Code]
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models, CVPR 2025. [Paper | Project]
Effective SAM Combination for Open-Vocabulary Semantic Segmentation, CVPR 2025. [Paper]
Removing Reflections from RAW Photos, CVPR 2025. [Paper | Project]
Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens, CVPR 2025. [Paper | Project | Code]
Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding, CVPR 2025. [Paper | Project | Code]
Towards Vision Language Models For Extra-Long Video Understanding, CVPR 2025. [Paper | Code]
SEAL: Semantic Attention Learning for Long Video Representation, CVPR 2025. [Paper]
Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval, CVPR 2025. [Paper]

Acknowledgements

We would like to express our gratitude for the code repository provided in cvpr25_oral_gpu_info.
