yejun688/CVPR2025_oral_paper_list

This repository contains the list of papers accepted for oral presentation at CVPR 2025

2025

  • OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels, CVPR 2025. [Paper | Code]

  • Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space, CVPR 2025. [Paper | Project | Code]

  • 3D Student Splatting and Scooping, CVPR 2025. [Paper | Code]

  • CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models, CVPR 2025. [Paper | Project | Code]

  • Reconstructing Humans with a Biomechanically Accurate Skeleton, CVPR 2025. [Paper | Project | Code]

  • Multi-view Reconstruction via SfM-guided Monocular Depth Estimation, CVPR 2025. [Paper | Project | Code]

  • Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models, CVPR 2025. [Paper | Project | Code]

  • CustAny: Customizing Anything from A Single Example, CVPR 2025. [Paper | Project | Code]

  • VGGT: Visual Geometry Grounded Transformer, CVPR 2025. [Paper | Project | Code] 🏆🏆🏆

  • Navigation World Models, CVPR 2025. [Paper | Project | Code]

  • MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos, CVPR 2025. [Paper | Project | Code]

  • Stereo4D: Learning How Things Move in 3D from Internet Stereo Videos, CVPR 2025. [Paper | Project | Code]

  • FoundationStereo: Zero-Shot Stereo Matching, CVPR 2025. [Paper | Project | Code]

  • Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models, CVPR 2025. [Paper | Code]

  • The PanAf-FGBG Dataset: Understanding the Impact of Backgrounds in Wildlife Behaviour Recognition, CVPR 2025. [Paper | Project]

  • Infinity: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis, CVPR 2025. [Paper | Project | Code]

  • Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing, CVPR 2025. [Paper | Project | Code]

  • MV-DUSt3R+: Single-Stage Scene Reconstruction from Sparse Views In 2 Seconds, CVPR 2025. [Paper | Project | Code]

  • TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization, CVPR 2025. [Paper | Project | Code]

  • Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models, CVPR 2025. [Paper | Code]

  • Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models, CVPR 2025. [Paper | Project | Code]

  • DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models, CVPR 2025. [Paper | Project | Code]

  • FluidNexus: 3D Fluid Reconstruction and Prediction from a Single Video, CVPR 2025. [Paper | Project | Code]

  • OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation, CVPR 2025. [Paper | Project | Code]

  • RandAR: Decoder-only Autoregressive Visual Generation in Random Orders, CVPR 2025. [Paper | Project | Code]

  • AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea, CVPR 2025. [Paper | Project | Code]

  • VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection, CVPR 2025. [Paper | Code]

  • SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing, CVPR 2025. [Paper | Project | Code]

  • OPA-DPO: Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key, CVPR 2025. [Paper | Project | Code]

  • Minority-Focused Text-to-Image Generation via Prompt Optimization, CVPR 2025. [Paper | Code]

  • Autoregressive Distillation of Diffusion Transformers, CVPR 2025. [Paper | Code]

  • Adv-CPG: A Customized Portrait Generation Framework with Facial Adversarial Attacks, CVPR 2025. [Paper | Code]

  • TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion, CVPR 2025. [Paper | Code]

  • Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models, CVPR 2025. [Paper | Blog | Code]

  • CUT3R: Continuous 3D Perception Model with Persistent State, CVPR 2025. [Paper | Project | Code]

  • Birth and Death of a Rose, CVPR 2025. [Paper | Project | Code]

  • Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content, CVPR 2025. [Paper | Code]

  • Rethinking Vision-Language Model in Face Forensics: Multi-Modal Interpretable Forged Face Detector, CVPR 2025. [Paper | Code]

  • Exploring CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation, CVPR 2025. [Paper | Code]

  • LoRASculpt: Sculpting LoRA for Harmonizing General and Specialized Knowledge in Multimodal Large Language Models, CVPR 2025. [Paper]

  • FedSPA: Generalizable Federated Graph Learning under Homophily Heterogeneity, CVPR 2025. [Paper]

  • Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces, CVPR 2025. [Paper | Project | Code]

  • Video-XL Family: Efficient VLMs for Extremely Long Video Understanding, CVPR 2025. [Paper | Code]

  • Neural Inverse Rendering from Propagating Light, CVPR 2025. [Paper | Project | Code] 🏆🏆🏆

  • MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision, CVPR 2025. [Paper | Project | Code]

  • Motion Prompting: Controlling Video Generation with Motion Trajectories, CVPR 2025. [Paper | Project]

  • Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise, CVPR 2025. [Paper | Project | Code]

  • LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping, CVPR 2025. [Paper | Project]

  • LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions, CVPR 2025. [Paper | Code]

  • Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild, CVPR 2025. [Paper | Project]

  • CleanDIFT: Diffusion Features without Noise, CVPR 2025. [Paper | Project | Code]

  • Towards Explicit Geometry-Reflectance Collaboration for Generalized LiDAR Segmentation in Adverse Weather, CVPR 2025. [Paper]

  • DiffFNO: Diffusion Fourier Neural Operator, CVPR 2025. [Paper | Project]

  • Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining, CVPR 2025. [Paper]

  • CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner, CVPR 2025. [Paper | Project | Code]

  • Reanimating Images using Neural Representations of Dynamic Stimuli, CVPR 2025. [Paper]

  • EgoLM: Multi-Modal Language Model of Egocentric Motions, CVPR 2025. [Paper | Project]

  • MEGA: Masked Generative Autoencoder for Human Mesh Recovery, CVPR 2025. [Paper]

  • Descriptor-In-Pixel: Point-Feature Tracking for Pixel Processor Arrays, CVPR 2025. [Project]

  • Temporally Consistent Object-Centric Learning by Contrasting Slots, CVPR 2025. [Paper | Project | Code]

  • Temporal Alignment-Free Video Matching for Few-shot Action Recognition, CVPR 2025. [Paper | Project | Code]

  • One Category One Prompt: Dataset Distillation using Diffusion Models, CVPR 2025. [Paper | Code]

  • IceDiff: High Resolution and High-Quality Sea Ice Forecasting with Generative Diffusion Prior, CVPR 2025. [Paper]

  • Efficient Test-time Adaptive Object Detection via Sensitivity-Guided Pruning, CVPR 2025. [Paper]

  • Keep the Balance: A Parameter-Efficient Symmetrical Framework for RGB+X Semantic Segmentation, CVPR 2025. [Paper]

  • Identifying and Mitigating Position Bias of Multi-image Vision-Language Models, CVPR 2025. [Paper]

  • From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons, CVPR 2025. [Paper]

  • Language-Guided Image Tokenization for Generation, CVPR 2025. [Paper | Project]

  • DreamRelation: Bridging Customization and Relation Generation, CVPR 2025. [Paper | Project | Code]

  • GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skills, CVPR 2025. [Paper | Project | Code]

  • Viewpoint Rosetta Stone: Unlocking Unpaired Ego-Exo Videos for View-invariant Representation Learning, CVPR 2025. [Project]

  • DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution, CVPR 2025. [Paper | Code]

  • Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World, CVPR 2025. [Paper | Code]

  • Learned Binocular-Encoding Optics for RGBD Imaging Using Joint Stereo and Focus Cues, CVPR 2025. [Project]

  • Camera resection from known line pencils and a radially distorted scanline, CVPR 2025. [Code]

  • Opportunistic Single-Photon Time of Flight, CVPR 2025. [Paper]

  • DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models, CVPR 2025. [Paper]

  • UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming, CVPR 2025. [Paper]

  • Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning, CVPR 2025. [Paper | Code]

  • Enhancing Diversity for Data-free Quantization, CVPR 2025. [Paper]

  • TopoCellGen: Generating Histopathology Cell Topology with a Diffusion Model, CVPR 2025. [Paper | Code]

  • Enhancing SAM with Efficient Prompting and Preference Optimization for Semi-supervised Medical Image Segmentation, CVPR 2025. [Paper]

  • Time of the Flight of the Gaussians: Fast and Accurate Dynamic Time-of-Flight Radiance Field, CVPR 2025. [Paper | Project | Code]

  • Zero-Shot Monocular Scene Flow Estimation in the Wild, CVPR 2025. [Paper | Project]

  • DNF: Unconditional 4D Generation with Dictionary-based Neural Fields, CVPR 2025. [Paper | Project | Code]

  • CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models, CVPR 2025. [Paper | Project]

  • Effective SAM Combination for Open-Vocabulary Semantic Segmentation, CVPR 2025. [Paper]

  • Removing Reflections from RAW Photos, CVPR 2025. [Paper | Project]

  • Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens, CVPR 2025. [Paper | Project | Code]

  • Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding, CVPR 2025. [Paper | Project | Code]

  • Towards Vision Language Models For Extra-Long Video Understanding, CVPR 2025. [Paper | Code]

  • SEAL: Semantic Attention Learning for Long Video Representation, CVPR 2025. [Paper]

  • Learning Audio-guided Video Representation with Gated Attention for Video-Text Retrieval, CVPR 2025. [Paper]

Acknowledgements

We would like to thank the maintainers of the cvpr25_oral_gpu_info code repository.