This is a repository dedicated to high quality figures from ICLR 2025 papers.

Mondrian-He/awesome-iclr-2025-artist

awesome-iclr-2025-artist

Important

For figures from other conferences such as NeurIPS, ICLR, ICML, EMNLP, or ACL, check out Awesome-artist! 🤩🤩🤩

Note

This repository collects the long papers from ICLR 2025. Each paper's framework diagrams, experimental figures, and other visuals are extracted so that their presentation techniques can be studied. Because the content is extensive and a single Markdown file cannot render everything reliably, it is split into 100 separate Markdown files, each covering up to 37 papers (the last part holds the remaining 24). The following section indexes where each paper is located 😁😁. Hope we can make progress together!


Warning

The README shown on the repository homepage may be automatically truncated. To view the full version, you can open the README file directly. 🤨


Important

Papers 1 to 211 are Oral.
Papers 212 to 586 are Spotlight.
The rest are Poster.
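
The numbering rule above can be sketched as a small lookup. This is only an illustration: the function name `presentation_type` is hypothetical and not part of this repository, and the boundaries are copied from the note above.

```python
# Hypothetical helper (not part of the repository): maps a paper's index
# number to its presentation type, using the boundaries stated above.
ORAL_LAST = 211        # papers 1..211 are Oral
SPOTLIGHT_LAST = 586   # papers 212..586 are Spotlight


def presentation_type(paper_number: int) -> str:
    """Return "Oral", "Spotlight", or "Poster" for a 1-based paper number."""
    if paper_number < 1:
        raise ValueError("paper numbers start at 1")
    if paper_number <= ORAL_LAST:
        return "Oral"
    if paper_number <= SPOTLIGHT_LAST:
        return "Spotlight"
    return "Poster"


if __name__ == "__main__":
    for n in (1, 211, 212, 586, 587):
        print(n, presentation_type(n))
```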


📚 Complete Paper Index

Total Papers: 3687

Split into 100 parts for better browsing

📖 Parts Summary

column1 column2 column3 column4 column5 column6 column7 column8
Part 1: 37 papers Part 2: 37 papers Part 3: 37 papers Part 4: 37 papers Part 5: 37 papers Part 6: 37 papers Part 7: 37 papers Part 8: 37 papers
Part 9: 37 papers Part 10: 37 papers Part 11: 37 papers Part 12: 37 papers Part 13: 37 papers Part 14: 37 papers Part 15: 37 papers Part 16: 37 papers
Part 17: 37 papers Part 18: 37 papers Part 19: 37 papers Part 20: 37 papers Part 21: 37 papers Part 22: 37 papers Part 23: 37 papers Part 24: 37 papers
Part 25: 37 papers Part 26: 37 papers Part 27: 37 papers Part 28: 37 papers Part 29: 37 papers Part 30: 37 papers Part 31: 37 papers Part 32: 37 papers
Part 33: 37 papers Part 34: 37 papers Part 35: 37 papers Part 36: 37 papers Part 37: 37 papers Part 38: 37 papers Part 39: 37 papers Part 40: 37 papers
Part 41: 37 papers Part 42: 37 papers Part 43: 37 papers Part 44: 37 papers Part 45: 37 papers Part 46: 37 papers Part 47: 37 papers Part 48: 37 papers
Part 49: 37 papers Part 50: 37 papers Part 51: 37 papers Part 52: 37 papers Part 53: 37 papers Part 54: 37 papers Part 55: 37 papers Part 56: 37 papers
Part 57: 37 papers Part 58: 37 papers Part 59: 37 papers Part 60: 37 papers Part 61: 37 papers Part 62: 37 papers Part 63: 37 papers Part 64: 37 papers
Part 65: 37 papers Part 66: 37 papers Part 67: 37 papers Part 68: 37 papers Part 69: 37 papers Part 70: 37 papers Part 71: 37 papers Part 72: 37 papers
Part 73: 37 papers Part 74: 37 papers Part 75: 37 papers Part 76: 37 papers Part 77: 37 papers Part 78: 37 papers Part 79: 37 papers Part 80: 37 papers
Part 81: 37 papers Part 82: 37 papers Part 83: 37 papers Part 84: 37 papers Part 85: 37 papers Part 86: 37 papers Part 87: 37 papers Part 88: 37 papers
Part 89: 37 papers Part 90: 37 papers Part 91: 37 papers Part 92: 37 papers Part 93: 37 papers Part 94: 37 papers Part 95: 37 papers Part 96: 37 papers
Part 97: 37 papers Part 98: 37 papers Part 99: 37 papers Part 100: 24 papers
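
To find which part holds a given paper, the split shown in the table can be reproduced with integer arithmetic. The sketch below assumes the parts are filled in index order (Part 1 holds papers 1 to 37, Part 2 holds papers 38 to 74, and so on), which the table suggests but does not state. The function names are hypothetical.

```python
import math

TOTAL_PAPERS = 3687    # from "Total Papers" above
PAPERS_PER_PART = 37   # from the parts summary table


def part_for_paper(paper_number: int) -> int:
    """Return the 1-based part that holds a paper, assuming index-order filling."""
    if not 1 <= paper_number <= TOTAL_PAPERS:
        raise ValueError("paper number out of range")
    return (paper_number - 1) // PAPERS_PER_PART + 1


def part_range(part: int) -> tuple[int, int]:
    """Return the (first, last) paper numbers covered by a part."""
    num_parts = math.ceil(TOTAL_PAPERS / PAPERS_PER_PART)
    if not 1 <= part <= num_parts:
        raise ValueError("part number out of range")
    first = (part - 1) * PAPERS_PER_PART + 1
    last = min(part * PAPERS_PER_PART, TOTAL_PAPERS)
    return first, last


if __name__ == "__main__":
    print(part_for_paper(38))   # paper 38 falls in part 2 under this assumption
    print(part_range(100))      # the last part covers the remaining 24 papers
```

Under that assumption, 99 full parts of 37 papers account for 3663 papers, and Part 100 holds the remaining 24, which matches the table.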

📝 All Papers by Title

  1. Probabilistic Learning to Defer: Handling Missing Expert Annotations and Controlling Workload Distribution

  2. Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks

  3. Joint Graph Rewiring and Feature Denoising via Spectral Resonance

  4. AFlow: Automating Agentic Workflow Generation

  5. DSPO: Direct Score Preference Optimization for Diffusion Model Alignment

  6. Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo

  7. Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning

  8. OLMoE: Open Mixture-of-Experts Language Models

  9. Learning to Discretize Denoising Diffusion ODEs

  10. When Selection Meets Intervention: Additional Complexities in Causal Discovery

  11. Scaling Laws for Precision

  12. Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

  13. Progressive distillation induces an implicit curriculum

  14. Diffusion-Based Planning for Autonomous Driving with Flexible Guidance

  15. Open-World Reinforcement Learning over Long Short-Term Imagination

  16. Faster Cascades via Speculative Decoding

  17. DEPT: Decoupled Embeddings for Pre-training Language Models

  18. CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation

  19. When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers

  20. Learning to Search from Demonstration Sequences

  21. Learning Distributions of Complex Fluid Simulations with Diffusion Graph Networks

  22. Capturing the Temporal Dependence of Training Data Influence

  23. Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models

  24. Scaling In-the-Wild Training for Diffusion-based Illumination Harmonization and Editing by Imposing Consistent Light Transport

  25. Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

  26. Robustness Inspired Graph Backdoor Defense

  27. Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective

  28. Scaling and evaluating sparse autoencoders

  29. Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models

  30. Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation

  31. Learning Dynamics of LLM Finetuning

  32. Classic but Everlasting: Traditional Gradient-Based Algorithms Converge Fast Even in Time-Varying Multi-Player Games

  33. MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts

  34. Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates

  35. Tractable Multi-Agent Reinforcement Learning through Behavioral Economics

  36. Do as We Do, Not as You Think: the Conformity of Large Language Models

  37. Linear Representations of Political Perspective Emerge in Large Language Models

  38. Rethinking Reward Modeling in Preference-based Large Language Model Alignment

  39. Homomorphism Expressivity of Spectral Invariant Graph Neural Networks

  40. PathGen-1.6M: 1.6 Million Pathology Image-text Pairs Generation through Multi-agent Collaboration

  41. Consistency Checks for Language Model Forecasters

  42. Towards a Complete Logical Framework for GNN Expressiveness

  43. On Scaling Up 3D Gaussian Splatting Training

  44. Data Scaling Laws in Imitation Learning for Robotic Manipulation

  45. MaestroMotif: Skill Design from Artificial Intelligence Feedback

  46. DarkBench: Benchmarking Dark Patterns in Large Language Models

  47. On the Benefits of Memory for Modeling Time-Dependent PDEs

  48. ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding

  49. CAX: Cellular Automata Accelerated in JAX

  50. Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

  51. Artificial Kuramoto Oscillatory Neurons

  52. Transformers Provably Solve Parity Efficiently with Chain of Thought

  53. Judge Decoding: Faster Speculative Sampling Requires Going Beyond Model Alignment

  54. Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models

  55. Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation

  56. WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

  57. LLM-SR: Scientific Equation Discovery via Programming with Large Language Models

  58. Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

  59. Learning and aligning single-neuron invariance manifolds in visual cortex

  60. Standard Gaussian Process is All You Need for High-Dimensional Bayesian Optimization

  61. Copyright-Protected Language Generation via Adaptive Model Fusion

  62. Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment

  63. Feedback Schrödinger Bridge Matching

  64. Root Cause Analysis of Anomalies in Multivariate Time Series through Granger Causal Discovery

  65. Instant Policy: In-Context Imitation Learning via Graph Diffusion

  66. GeSubNet: Gene Interaction Inference for Disease Subtype Network Generation

  67. Training on the Test Task Confounds Evaluation and Emergence

  68. Rethinking the generalization of drug target affinity prediction algorithms via similarity aware evaluation

  69. Can Neural Networks Achieve Optimal Computational-statistical Tradeoff? An Analysis on Single-Index Model

  70. AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text

  71. Second-Order Min-Max Optimization with Lazy Hessians

  72. Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics

  73. Cross-Entropy Is All You Need To Invert the Data Generating Process

  74. On the Role of Attention Heads in Large Language Model Safety

  75. Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning

  76. Learning stochastic dynamics from snapshots through regularized unbalanced optimal transport

  77. Composing Unbalanced Flows for Flexible Docking and Relaxation

  78. A Computational Framework for Modeling Emergence of Color Vision in the Human Brain

  79. Improving Probabilistic Diffusion Models With Optimal Diagonal Covariance Matching

  80. BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models

  81. Combatting Dimensional Collapse in LLM Pre-Training Data via Submodular File Selection

  82. Influence Functions for Scalable Data Attribution in Diffusion Models

  83. Language Representations Can be What Recommenders Need: Findings and Potentials

  84. Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition

  85. Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free

  86. Emergence of meta-stable clustering in mean-field transformer models

  87. Data Selection via Optimal Control for Language Models

  88. Feedback Favors the Generalization of Neural ODEs

  89. Comparing noisy neural population dynamics using optimal transport distances

  90. Subgraph Federated Learning for Local Generalization

  91. RB-Modulation: Training-Free Stylization using Reference-Based Modulation

  92. The Geometry of Categorical and Hierarchical Concepts in Large Language Models

  93. TopoLM: brain-like spatio-functional organization in a topographic language model

  94. Problem-Parameter-Free Federated Learning

  95. Latent Bayesian Optimization via Autoregressive Normalizing Flows

  96. BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

  97. ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement

  98. LaMPlace: Learning to Optimize Cross-Stage Metrics in Macro Placement

  99. MOS: Model Synergy for Test-Time Adaptation on LiDAR-Based 3D Object Detection

  100. On Conformal Isometry of Grid Cells: Learning Distance-Preserving Position Embedding

  101. Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

  102. EmbodiedSAM: Online Segment Any 3D Thing in Real Time

  103. Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping

  104. LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior

  105. Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding

  106. Self-Improvement in Language Models: The Sharpening Mechanism

  107. Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

  108. LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization

  109. Reasoning Elicitation in Language Models via Counterfactual Feedback

  110. Attention as a Hypernetwork

  111. Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues

  112. Learning Randomized Algorithms with Transformers

  113. Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement

  114. HiRA: Parameter-Efficient Hadamard High-Rank Adaptation for Large Language Models

  115. Proteina: Scaling Flow-based Protein Structure Generative Models

  116. REEF: Representation Encoding Fingerprints for Large Language Models

  117. A Decade's Battle on Dataset Bias: Are We There Yet?

  118. Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance

  119. Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

  120. ECD: A Machine Learning Benchmark for Predicting Enhanced-Precision Electronic Charge Density in Crystalline Inorganic Materials

  121. Generator Matching: Generative modeling with arbitrary Markov processes

  122. Brain Bandit: A Biologically Grounded Neural Network for Efficient Control of Exploration

  123. Do LLMs Recognize Your Preferences? Evaluating Personalized Preference Following in LLMs

  124. LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias

  125. From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions

  126. Advantage Alignment Algorithms

  127. RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style

  128. PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding

  129. Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

  130. Steering Protein Family Design through Profile Bayesian Flow

  131. Unlearning-based Neural Interpretations

  132. On the Hölder Stability of Multiset and Graph Neural Networks

  133. No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images

  134. KAN: Kolmogorov–Arnold Networks

  135. Differential Transformer

  136. One Step Diffusion via Shortcut Models

  137. FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference

  138. A Theoretically-Principled Sparse, Connected, and Rigid Graph Representation of Molecules

  139. REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments

  140. Limits to scalable evaluation at the frontier: LLM as judge won’t beat twice the data

  141. MAP: Multi-Human-Value Alignment Palette

  142. SANA: Efficient High-Resolution Text-to-Image Synthesis with Linear Diffusion Transformers

  143. Learning to Discover Regulatory Elements for Gene Expression Prediction

  144. Simplifying, Stabilizing and Scaling Continuous-time Consistency Models

  145. TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation

  146. ShEPhERD: Diffusing shape, electrostatics, and pharmacophores for bioisosteric drug design

  147. miniCTX: Neural Theorem Proving with (Long-)Contexts

  148. Residual Deep Gaussian Processes on Manifolds

  149. Accelerated training through iterative gradient propagation along the residual path

  150. Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

  151. Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models

  152. AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models

  153. STAR: Synthesis of Tailored Architectures

  154. MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

  155. SAM 2: Segment Anything in Images and Videos

  156. Data Shapley in One Training Run

  157. Oscillatory State-Space Models

  158. Restructuring Vector Quantization with the Rotation Trick

  159. MMQA: Evaluating LLMs with Multi-Table Multi-Hop Complex Questions

  160. GridMix: Exploring Spatial Modulation for Neural Fields in PDE Modeling

  161. More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness

  162. Population Transformer: Learning Population-level Representations of Neural Activity

  163. Inference Scaling for Long-Context Retrieval Augmented Generation

  164. Proxy Denoising for Source-Free Domain Adaptation

  165. Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs

  166. Topological Blindspots: Understanding and Extending Topological Deep Learning Through the Lens of Expressivity

  167. Retrieval Head Mechanistically Explains Long-Context Factuality

  168. MIND over Body: Adaptive Thinking using Dynamic Computation

  169. How much of my dataset did you use? Quantitative Data Usage Inference in Machine Learning

  170. SymmetricDiffusers: Learning Discrete Diffusion on Finite Symmetric Groups

  171. Cut Your Losses in Large-Vocabulary Language Models

  172. Interpreting Emergent Planning in Model-Free Reinforcement Learning

  173. Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

  174. Progressive Compression with Universally Quantized Diffusion Models

  175. High-Dynamic Radar Sequence Prediction for Weather Nowcasting Using Spatiotemporal Coherent Gaussian Representation

  176. Training Language Models to Self-Correct via Reinforcement Learning

  177. Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation

  178. What should a neuron aim for? Designing local objective functions based on information theory

  179. Backtracking Improves Generation Safety

  180. Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment

  181. Global Convergence in Neural ODEs: Impact of Activation Functions

  182. Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces

  183. The Hidden Cost of Waiting for Accurate Predictions

  184. DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications for Multi-Task RL

  185. The Complexity of Two-Team Polymatrix Games with Independent Adversaries

  186. Amortized Control of Continuous State Space Feynman-Kac Model for Irregular Time Series

  187. TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes

  188. MoDeGPT: Modular Decomposition for Large Language Model Compression

  189. Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities

  190. Geometry-aware RL for Manipulation of Varying Shapes and Deformable Objects

  191. MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

  192. Safety Alignment Should be Made More Than Just a Few Tokens Deep

  193. Variational Diffusion Posterior Sampling with Midpoint Guidance

  194. NeuralPlane: Structured 3D Reconstruction in Planar Primitives with Neural Fields

  195. SD-LoRA: Scalable Decoupled Low-Rank Adaptation for Class Incremental Learning

  196. Energy-based Backdoor Defense Against Federated Graph Learning

  197. Prioritized Generative Replay

  198. A Probabilistic Perspective on Unlearning and Alignment for Large Language Models

  199. Exploring The Loss Landscape Of Regularized Neural Networks Via Convex Duality

  200. Flat Reward in Policy Parameter Space Implies Robust Reinforcement Learning

  201. Scaling LLM Test-Time Compute Optimally Can be More Effective than Scaling Parameters for Reasoning

  202. Compositional Entailment Learning for Hyperbolic Vision-Language Models

  203. OptionZero: Planning with Learned Options

  204. On the Identification of Temporal Causal Representation with Instantaneous Dependence

  205. Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning

  206. RMP-SAM: Towards Real-Time Multi-Purpose Segment Anything

  207. Open-Vocabulary Customization from CLIP via Data-Free Knowledge Distillation

  208. Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse

  209. TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis

  210. ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids

  211. Synthetic continued pretraining

  212. ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability

  213. LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

  214. Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations

  215. BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

  216. Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI

  217. MaRS: A Fast Sampler for Mean Reverting Diffusion based on ODE and SDE Solvers

  218. Robust Function-Calling for On-Device Language Model via Function Masking

  219. GOLD: Graph Out-of-Distribution Detection via Implicit Adversarial Latent Generation

  220. Advantage-Guided Distillation for Preference Alignment in Small Language Models

  221. JudgeLM: Fine-tuned Large Language Models are Scalable Judges

  222. Stem-OB: Generalizable Visual Imitation Learning with Stem-Like Convergent Observation through Diffusion Inversion

  223. Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control

  224. Better autoregressive regression with LLMs via regression-aware fine-tuning

  225. The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions

  226. CausalRivers - Scaling up benchmarking of causal discovery for real-world time-series

  227. Tuning Frequency Bias of State Space Models

  228. Provably Accurate Shapley Value Estimation via Leverage Score Sampling

  229. GrabS: Generative Embodied Agent for 3D Object Segmentation without Scene Supervision

  230. Diffusion On Syntax Trees For Program Synthesis

  231. Effective Interplay between Sparsity and Quantization: From Theory to Practice

  232. VLMaterial: Procedural Material Generation with Large Vision-Language Models

  233. Analyzing Neural Scaling Laws in Two-Layer Networks with Power-Law Data Spectra

  234. Test-time Alignment of Diffusion Models without Reward Over-optimization

  235. SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models

  236. Decomposition Polyhedra of Piecewise Linear Functions

  237. Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models

  238. UniMatch: Universal Matching from Atom to Task for Few-Shot Drug Discovery

  239. SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints

  240. Counterfactual Realizability

  241. How Much is Unseen Depends Chiefly on Information About the Seen

  242. Can Watermarked LLMs be Identified by Users via Crafted Prompts?

  243. IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning

  244. Weighted Point Set Embedding for Multimodal Contrastive Learning Toward Optimal Similarity Metric

  245. On the Expressiveness of Rational ReLU Neural Networks With Bounded Depth

  246. Enhancing the Scalability and Applicability of Kohn-Sham Hamiltonians for Molecular Systems

  247. Improving Convergence Guarantees of Random Subspace Second-order Algorithm for Nonconvex Optimization

  248. Temporal Heterogeneous Graph Generation with Privacy, Utility, and Efficiency

  249. Knowledge Localization: Mission Not Accomplished? Enter Query Localization!

  250. Boosting Ray Search Procedure of Hard-label Attacks with Transfer-based Priors

  251. Uncovering Overfitting in Large Language Model Editing

  252. Effective post-training embedding compression via temperature control in contrastive training

  253. Bundle Neural Network for message diffusion on graphs

  254. Benchmarking Predictive Coding Networks -- Made Simple

  255. Graph Neural Networks Can (Often) Count Substructures

  256. LiveBench: A Challenging, Contamination-Limited LLM Benchmark

  257. Emergent Orientation Maps —— Mechanisms, Coding Efficiency and Robustness

  258. PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance

  259. UniCBE: An Uniformity-driven Comparing Based Evaluation Framework with Unified Multi-Objective Optimization

  260. Joint Gradient Balancing for Data Ordering in Finite-Sum Multi-Objective Optimization

  261. Interleaved Scene Graphs for Interleaved Text-and-Image Generation Assessment

  262. Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought

  263. Improving Unsupervised Constituency Parsing via Maximizing Semantic Information

  264. 4K4DGen: Panoramic 4D Generation at 4K Resolution

  265. DEEM: Diffusion models serve as the eyes of large language models for image perception

  266. Demystifying the Token Dynamics of Deep Selective State Space Models

  267. Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization

  268. GETS: Ensemble Temperature Scaling for Calibration in Graph Neural Networks

  269. Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation

  270. Iterative Label Refinement Matters More than Preference Optimization under Weak Supervision

  271. Retri3D: 3D Neural Graphics Representation Retrieval

  272. CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control

  273. Scalable Decision-Making in Stochastic Environments through Learned Temporal Abstraction

  274. Scalable and Certifiable Graph Unlearning: Overcoming the Approximation Error Barrier

  275. Control-oriented Clustering of Visual Latent Representation

  276. No Need to Talk: Asynchronous Mixture of Language Models

  277. Exploring Local Memorization in Diffusion Models via Bright Ending Attention

  278. Provably Reliable Conformal Prediction Sets in the Presence of Data Poisoning

  279. LLaVA-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models

  280. TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

  281. Not All LLM-Generated Data Are Equal: Rethinking Data Weighting in Text Classification

  282. ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish

  283. Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence

  284. Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding

  285. Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

  286. OS-ATLAS: Foundation Action Model for Generalist GUI Agents

  287. ThinK: Thinner Key Cache by Query-Driven Pruning

  288. Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking

  289. Probabilistic Geometric Principal Component Analysis with application to neural data

  290. CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph

  291. MQuAKE-Remastered: Multi-Hop Knowledge Editing Can Only Be Advanced with Reliable Evaluations

  292. Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model

  293. Boltzmann-Aligned Inverse Folding Model as a Predictor of Mutational Effects on Protein-Protein Interactions

  294. gRNAde: Geometric Deep Learning for 3D RNA inverse design

  295. RESuM: A Rare Event Surrogate Model for Physics Detector Design

  296. NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models

  297. Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction

  298. MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion

  299. Online Reinforcement Learning in Non-Stationary Context-Driven Environments

  300. ADIFF: Explaining audio difference using natural language

  301. Controlling Language and Diffusion Models by Transporting Activations

  302. Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model

  303. OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

  304. Diffusion Attribution Score: Evaluating Training Data Influence in Diffusion Models

  305. Learning local equivariant representations for quantum operators

  306. LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation

  307. MetaUrban: An Embodied AI Simulation Platform for Urban Micromobility

  308. INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge

  309. Identifiable Exchangeable Mechanisms for Causal Structure and Representation Learning

  310. Preference Optimization for Reasoning with Pseudo Feedback

  311. Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation

  312. SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning

  313. AnalogGenie: A Generative Engine for Automatic Discovery of Analog Circuit Topologies

  314. PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training

  315. SePer: Measure Retrieval Utility Through The Lens Of Semantic Perplexity Reduction

  316. Recognize Any Surgical Object: Unleashing the Power of Weakly-Supervised Data

  317. Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency

  318. Modeling Complex System Dynamics with Flow Matching Across Time and Conditions

  319. Understanding Factual Recall in Transformers via Associative Memories

  320. MixEval-X: Any-to-any Evaluations from Real-world Data Mixture

  321. Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization

  322. Rethinking and Improving Autoformalization: Towards a Faithful Metric and a Dependency Retrieval-based Approach

  323. MamKO: Mamba-based Koopman operator for modeling and predictive control

  324. Probabilistic Neural Pruning via Sparsity Evolutionary Fokker-Planck-Kolmogorov Equation

  325. Diffusion Bridge AutoEncoders for Unsupervised Representation Learning

  326. Bayesian Experimental Design Via Contrastive Diffusions

  327. Mixture-of-Agents Enhances Large Language Model Capabilities

  328. Uncovering Gaps in How Humans and LLMs Interpret Subjective Language

  329. Improving the Sparse Structure Learning of Spiking Neural Networks from the View of Compression Efficiency

  330. ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Sentences

  331. Representative Guidance: Diffusion Model Sampling with Coherence

  332. LoRA-Pro: Are Low-Rank Adapters Properly Optimized?

  333. Bilinear MLPs enable weight-based mechanistic interpretability

  334. DRoP: Distributionally Robust Data Pruning

  335. Vision Language Models are In-Context Value Learners

  336. Linear Spherical Sliced Optimal Transport: A Fast Metric for Comparing Spherical Data

  337. Conformal Prediction Sets Can Cause Disparate Impact

  338. PhyMPGN: Physics-encoded Message Passing Graph Network for spatiotemporal PDE systems

  339. Generalization Guarantees for Representation Learning via Data-Dependent Gaussian Mixture Priors

  340. Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model

  341. Strong Model Collapse

  342. CBQ: Cross-Block Quantization for Large Language Models

  343. Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts

  344. Reinforcement Learning for Control of Non-Markovian Cellular Population Dynamics

  345. BirdSet: A Large-Scale Dataset for Audio Classification in Avian Bioacoustics

  346. Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling

  347. Training-Free Activation Sparsity in Large Language Models

  348. How Feature Learning Can Improve Neural Scaling Laws

  349. Beyond Next Token Prediction: Patch-Level Training for Large Language Models

  350. Exact Certification of (Graph) Neural Networks Against Label Poisoning

  351. Decentralized Sporadic Federated Learning: A Unified Algorithmic Framework with Convergence Guarantees

  352. Credal Wrapper of Model Averaging for Uncertainty Estimation in Classification

  353. X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale

  354. TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models

  355. Wasserstein Distances, Neuronal Entanglement, and Sparsity

  356. Geometric Inductive Biases of Deep Networks: The Role of Data and Architecture

  357. Online Preference Alignment for Language Models via Count-based Exploration

  358. Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution

  359. BodyGen: Advancing Towards Efficient Embodiment Co-Design

  360. Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models

  361. One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt

  362. Linear SCM Identification in the Presence of Confounders and Gaussian Noise

  363. AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs

  364. NetFormer: An interpretable model for recovering dynamical connectivity in neuronal population dynamics

  365. MotionAura: Generating High-Quality and Motion Consistent Videos using Discrete Diffusion

  366. Harnessing Diversity for Important Data Selection in Pretraining Large Language Models

  367. Generating Freeform Endoskeletal Robots

  368. DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models

  369. Dense Video Object Captioning from Disjoint Supervision

  370. Targeted Attack Improves Protection against Unauthorized Diffusion Customization

  371. A Geometric Framework for Understanding Memorization in Generative Models

  372. Recovering Manifold Structure Using Ollivier Ricci Curvature

  373. Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?

  374. Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training

  375. Fine-tuning with Reserved Majority for Noise Reduction

  376. Min-K%++: Improved Baseline for Pre-Training Data Detection from Large Language Models

  377. Nesterov acceleration in benignly non-convex landscapes

  378. Can Large Language Models Understand Symbolic Graphics Programs?

  379. PABBO: Preferential Amortized Black-Box Optimization

  380. Learning Transformer-based World Models with Contrastive Predictive Coding

  381. Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

  382. Competition Dynamics Shape Algorithmic Phases of In-Context Learning

  383. DartControl: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control

  384. Topological Schrödinger Bridge Matching

  385. Learning-Augmented Frequent Directions

  386. Perm: A Parametric Representation for Multi-Style 3D Hair Modeling

  387. Learning to Solve Differential Equation Constrained Optimization Problems

  388. Co$^{\mathbf{3}}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion

  389. Fair Clustering in the Sliding Window Model

  390. Language Model Alignment in Multilingual Trolley Problems

  391. Joint Reward and Policy Learning with Demonstrations and Human Feedback Improves Alignment

  392. Biologically Constrained Barrel Cortex Model Integrates Whisker Inputs and Replicates Key Brain Network Dynamics

  393. Linear Mode Connectivity in Differentiable Tree Ensembles

  394. AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories

  395. SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation

  396. Nonlinear Sequence Embedding by Monotone Variational Inequality

  397. On Disentangled Training for Nonlinear Transform in Learned Image Compression

  398. InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences

  399. Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding

  400. Regularization by Texts for Latent Diffusion Inverse Solvers

  401. First-Person Fairness in Chatbots

  402. MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark

  403. Uncertainty Modeling in Graph Neural Networks via Stochastic Differential Equations

  404. Surprising Effectiveness of pretraining Ternary Language Model at Scale

  405. Provable Uncertainty Decomposition via Higher-Order Calibration

  406. TopoNets: High performing vision and language models with brain-like topography

  407. Deep Learning Alternatives Of The Kolmogorov Superposition Theorem

  408. Robustness Reprogramming for Representation Learning

  409. Attention with Markov: A Curious Case of Single-layer Transformers

  410. Higher-Order Graphon Neural Networks: Approximation and Cut Distance

  411. Exploring the Camera Bias of Person Re-identification

  412. Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient

  413. Meta-Dynamical State Space Models for Integrative Neural Data Analysis

  414. Lean-STaR: Learning to Interleave Thinking and Proving

  415. Revisiting Random Walks for Learning on Graphs

  416. Progressive Compositionality in Text-to-Image Generative Models

  417. ODE-based Smoothing Neural Network for Reinforcement Learning Tasks

  418. How to Find the Exact Pareto Front for Multi-Objective MDPs?

  419. Easing Training Process of Rectified Flow Models Via Lengthening Inter-Path Distance

  420. FairMT-Bench: Benchmarking Fairness for Multi-turn Dialogue in Conversational LLMs

  421. SRSA: Skill Retrieval and Adaptation for Robotic Assembly Tasks

  422. Towards General-Purpose Model-Free Reinforcement Learning

  423. The Computational Complexity of Circuit Discovery for Inner Interpretability

  424. Learning from End User Data with Shuffled Differential Privacy over Kernel Densities

  425. VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning

  426. Spectral Compressive Imaging via Unmixing-driven Subspace Diffusion Refinement

  427. Topograph: An Efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation

  428. MorphoDiff: Cellular Morphology Painting with Diffusion Models

  429. LoCoDL: Communication-Efficient Distributed Learning with Local Training and Compression

  430. Let SSMs be ConvNets: State-space Modeling with Optimal Tensor Contractions

  431. SoftCVI: Contrastive variational inference with self-generated soft labels

  432. Adam Exploits $\ell_\infty$-geometry of Loss Landscape via Coordinate-wise Adaptivity

  433. DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life

  434. u-$\mu$P: The Unit-Scaled Maximal Update Parametrization

  435. Instance-dependent Early Stopping

  436. Learning vector fields of differential equations on manifolds with geometrically constrained operator-valued kernels

  437. Programming Refusal with Conditional Activation Steering

  438. Samba: Synchronized Set-of-Sequences Modeling for Multiple Object Tracking

  439. Moner: Motion Correction in Undersampled Radial MRI with Unsupervised Neural Representation

  440. SPA-BENCH: A Comprehensive Benchmark for Smartphone Agent Evaluation

  441. A Second-Order Perspective on Model Compositionality and Incremental Learning

  442. Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes

  443. Formation of Representations in Neural Networks

  444. RAG-SR: Retrieval-Augmented Generation for Neural Symbolic Regression

  445. Towards Automated Knowledge Integration From Human-Interpretable Representations

  446. Towards Marginal Fairness Sliced Wasserstein Barycenter

  447. How new data permeates LLM knowledge and how to dilute it

  448. TOP-ERL: Transformer-based Off-Policy Episodic Reinforcement Learning

  449. Multi-Draft Speculative Sampling: Canonical Decomposition and Theoretical Limits

  450. Anti-Exposure Bias in Diffusion Models

  451. Continuous Exposure Learning for Low-light Image Enhancement using Neural ODEs

  452. 3DIS: Depth-Driven Decoupled Image Synthesis for Universal Multi-Instance Generation

  453. WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

  454. Mitigating Memorization in Language Models

  455. D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement

  456. Following the Human Thread in Social Navigation

  457. DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes

  458. CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation

  459. A Periodic Bayesian Flow for Material Generation

  460. Generalized Principal-Agent Problem with a Learning Agent

  461. Efficient and Accurate Explanation Estimation with Distribution Compression

  462. Nonlinear multiregion neural dynamics with parametric impulse response communication channels

  463. POTEC: Off-Policy Contextual Bandits for Large Action Spaces via Policy Decomposition

  464. LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation models

  465. Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering

  466. OASIS Uncovers: High-Quality T2I Models, Same Old Stereotypes

  467. TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks

  468. Streaming Algorithms For $\ell_p$ Flows and $\ell_p$ Regression

  469. Simple yet Effective Incomplete Multi-view Clustering: Similarity-level Imputation and Intra-view Hybrid-group Prototype Construction

  470. PaRa: Personalizing Text-to-Image Diffusion via Parameter Rank Reduction

  471. To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts

  472. Active Task Disambiguation with LLMs

  473. Systems with Switching Causal Relations: A Meta-Causal Perspective

  474. Enhancing Learning with Label Differential Privacy by Vector Approximation

  475. Multi-session, multi-task neural decoding from distinct cell-types and brain regions

  476. Sparse components distinguish visual pathways & their alignment to neural networks

  477. Revisiting text-to-image evaluation with Gecko: on metrics, prompts, and human rating

  478. Tell me about yourself: LLMs are aware of their learned behaviors

  479. CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models

  480. Differential learning kinetics govern the transition from memorization to generalization during in-context learning

  481. Implicit Bias of Mirror Flow for Shallow Neural Networks in Univariate Regression

  482. Streamlining Redundant Layers to Compress Large Language Models

  483. SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding

  484. Union-over-Intersections: Object Detection beyond Winner-Takes-All

  485. DLEFT-MKC: Dynamic Late Fusion Multiple Kernel Clustering with Robust Tensor Learning via Min-Max Optimization

  486. Atlas Gaussians Diffusion for 3D Generation

  487. Enhancing Pre-trained Representation Classifiability can Boost its Interpretability

  488. Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models

  489. Presto! Distilling Steps and Layers for Accelerating Music Generation

  490. Grounding Video Models to Actions through Goal Conditioned Exploration

  491. Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws

  492. Realistic Evaluation of Deep Partial-Label Learning Algorithms

  493. EmbedLLM: Learning Compact Representations of Large Language Models

  494. In Search of Forgotten Domain Generalization

  495. Learning Equivariant Non-Local Electron Density Functionals

  496. Differentiable Integer Linear Programming

  497. Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling

  498. Poison-splat: Computation Cost Attack on 3D Gaussian Splatting

  499. Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation

  500. Improved Approximation Algorithms for $k$-Submodular Maximization via Multilinear Extension

  501. AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials

  502. Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences

  503. Scaling FP8 training to trillion-token LLMs

  504. Student-Informed Teacher Training

  505. Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation

  506. Estimating the Probabilities of Rare Outputs in Language Models

  507. Imputation for prediction: beware of diminishing returns.

  508. Physics-aligned field reconstruction with diffusion bridge

  509. Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?

  510. $R^2$-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning

  511. Test-time Adaptation for Cross-modal Retrieval with Query Shift

  512. Severing Spurious Correlations with Data Pruning

  513. Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance

  514. Lightweight Neural App Control

  515. DeLLMa: Decision Making Under Uncertainty with Large Language Models

  516. Multi-Robot Motion Planning with Diffusion Models

  517. ConFIG: Towards Conflict-free Training of Physics Informed Neural Networks

  518. MagicPIG: LSH Sampling for Efficient LLM Generation

  519. Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning

  520. Hymba: A Hybrid-head Architecture for Small Language Models

  521. AutoCGP: Closed-Loop Concept-Guided Policies from Unlabeled Demonstrations

  522. A CLIP-Powered Framework for Robust and Generalizable Data Selection

  523. OSDA Agent: Leveraging Large Language Models for De Novo Design of Organic Structure Directing Agents

  524. Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models

  525. Conditional Diffusion with Ordinal Regression: Longitudinal Data Generation for Neurodegenerative Disease Studies

  526. SplatFormer: Point Transformer for Robust 3D Gaussian Splatting

  527. When do GFlowNets learn the right distribution?

  528. On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent

  529. Towards a Unified and Verified Understanding of Group-Operation Networks

  530. DenseMatcher: Learning 3D Semantic Correspondence for Category-Level Manipulation from a Single Demo

  531. Don't flatten, tokenize! Unlocking the key to SoftMoE's efficacy in deep RL

  532. Quality Measures for Dynamic Graph Generative Models

  533. Direct Post-Training Preference Alignment for Multi-Agent Motion Generation Model Using Implicit Feedback from Pre-training Demonstrations

  534. Better Instruction-Following Through Minimum Bayes Risk

  535. LiFT: Learning to Fine-Tune via Bayesian Parameter Efficient Meta Fine-Tuning

  536. Theory on Mixture-of-Experts in Continual Learning

  537. Simplifying Deep Temporal Difference Learning

  538. What Makes a Good Diffusion Planner for Decision Making?

  539. Graph Sparsification via Mixture of Graphs

  540. When Attention Sink Emerges in Language Models: An Empirical View

  541. TabWak: A Watermark for Tabular Diffusion Models

  542. Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition

  543. MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL

  544. Answer, Assemble, Ace: Understanding How LMs Answer Multiple Choice Questions

  545. Scaling up the Banded Matrix Factorization Mechanism for Large Scale Differentially Private ML

  546. BlendRL: A Framework for Merging Symbolic and Neural Policy Learning

  547. COPER: Correlation-based Permutations for Multi-View Clustering

  548. MAGNet: Motif-Agnostic Generation of Molecules from Scaffolds

  549. RegMix: Data Mixture as Regression for Language Model Pre-training

  550. Enhancing Compositional Text-to-Image Generation with Reliable Random Seeds

  551. Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations

  552. Accelerating Goal-Conditioned Reinforcement Learning Algorithms and Research

  553. Beyond Squared Error: Exploring Loss Design for Enhanced Training of Generative Flow Networks

  554. ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction

  555. Learning from negative feedback, or positive feedback or both

  556. Planning in Natural Language Improves LLM Search for Code Generation

  557. On Quantizing Neural Representation for Variable-Rate Video Coding

  558. DiffPuter: Empowering Diffusion Models for Missing Data Imputation

  559. What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis

  560. Broaden your SCOPE! Efficient Multi-turn Conversation Planning for LLMs with Semantic Space

  561. LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models

  562. Multi-Field Adaptive Retrieval

  563. RelitLRM: Generative Relightable Radiance for Large Reconstruction Models

  564. Learning Spatiotemporal Dynamical Systems from Point Process Observations

  565. uniINF: Best-of-Both-Worlds Algorithm for Parameter-Free Heavy-Tailed MABs

  566. The Superposition of Diffusion Models Using the Itô Density Estimator

  567. Towards hyperparameter-free optimization with differential privacy

  568. Discovering Temporally Compositional Neural Manifolds with Switching Infinite GPFA

  569. DeepRTL: Bridging Verilog Understanding and Generation with a Unified Representation Model

  570. DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference

  571. CREIMBO: Cross-Regional Ensemble Interactions in Multi-view Brain Observations

  572. High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws

  573. NetMoE: Accelerating MoE Training through Dynamic Sample Placement

  574. Bayesian Optimization via Continual Variational Last Layer Training

  575. Fast Uncovering of Protein Sequence Diversity from Structure

  576. Century: A Framework and Dataset for Evaluating Historical Contextualisation of Sensitive Images

  577. MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code

  578. OmniRe: Omni Urban Scene Reconstruction

  579. In vivo cell-type and brain region classification via multimodal contrastive learning

  580. Monitoring Latent World States in Language Models with Propositional Probes

  581. Universal generalization guarantees for Wasserstein distributionally robust models

  582. PETRA: Parallel End-to-end Training with Reversible Architectures

  583. ThunderKittens: Simple, Fast, and $\textit{Adorable}$ Kernels

  584. Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage

  585. Holistically Evaluating the Environmental Impact of Creating Language Models

  586. Adaptive Gradient Clipping for Robust Federated Learning

  587. DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback

  588. Re-Imagining Multimodal Instruction Tuning: A Representation View

  589. Inverse decision-making using neural amortized Bayesian actors

  590. Designing Concise ConvNets with Columnar Stages

  591. MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow

  592. Let the Code LLM Edit Itself When You Edit the Code

  593. Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset

  594. Generalizing Reasoning Problems to Longer Lengths

  595. Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

  596. Understanding the Stability-based Generalization of Personalized Federated Learning

  597. IgGM: A Generative Model for Functional Antibody and Nanobody Design

  598. MMTEB: Massive Multilingual Text Embedding Benchmark

  599. Ultra-Sparse Memory Network

  600. Lines of Thought in Large Language Models

  601. Do You Keep an Eye on What I Ask? Mitigating Multimodal Hallucination via Attention-Guided Ensemble Decoding

  602. Exact Community Recovery under Side Information: Optimality of Spectral Algorithms

  603. Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHR Data

  604. Deconstructing What Makes a Good Optimizer for Autoregressive Language Models

  605. Dataset Ownership Verification in Contrastive Pre-trained Models

  606. System 1.x: Learning to Balance Fast and Slow Planning with Language Models

  607. Time-to-Event Pretraining for 3D Medical Imaging

  608. Semialgebraic Neural Networks: From roots to representations

  609. h4rm3l: A Language for Composable Jailbreak Attack Synthesis

  610. Synthesizing Realistic fMRI: A Physiological Dynamics-Driven Hierarchical Diffusion Model for Efficient fMRI Acquisition

  611. Semantic Temporal Abstraction via Vision-Language Model Guidance for Efficient Reinforcement Learning

  612. Shared-AE: Automatic Identification of Shared Subspaces in High-dimensional Neural and Behavioral Activity

  613. To Code or Not To Code? Exploring Impact of Code in Pre-training

  614. Feature Averaging: An Implicit Bias of Gradient Descent Leading to Non-Robustness in Neural Networks

  615. Enhancing Clustered Federated Learning: Integration of Strategies and Improved Methodologies

  616. Unified Parameter-Efficient Unlearning for LLMs

  617. FreSh: Frequency Shifting for Accelerated Neural Representation Learning

  618. EvA: Erasing Spurious Correlations with Activations

  619. Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching

  620. RocketEval: Efficient automated LLM evaluation via grading checklist

  621. Zero-cost Proxy for Adversarial Robustness Evaluation

  622. A Skewness-Based Criterion for Addressing Heteroscedastic Noise in Causal Discovery

  623. Exact Byte-Level Probabilities from Tokenized Language Models for FIM-Tasks and Model Ensembles

  624. Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

  625. VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents

  626. Differentiable Rule Induction from Raw Sequence Inputs

  627. NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer

  628. Group Ligands Docking to Protein Pockets

  629. Do Stochastic, Feel Noiseless: Stable Stochastic Optimization via a Double Momentum Mechanism

  630. SSOLE: Rethinking Orthogonal Low-rank Embedding for Self-Supervised Learning

  631. EgoSim: Egocentric Exploration in Virtual Worlds with Multi-modal Conditioning

  632. Deep Kernel Relative Test for Machine-generated Text Detection

  633. Random Is All You Need: Random Noise Injection on Feature Statistics for Generalizable Deep Image Denoising

  634. GOAL: A Generalist Combinatorial Optimization Agent Learner

  635. Integral Performance Approximation for Continuous-Time Reinforcement Learning Control

  636. FLOPS: Forward Learning with OPtimal Sampling

  637. Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaptation

  638. Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers

  639. LICO: Large Language Models for In-Context Molecular Optimization

  640. SiReRAG: Indexing Similar and Related Information for Multihop Reasoning

  641. AutoBencher: Towards Declarative Benchmark Construction

  642. DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training

  643. ELFS: Label-Free Coreset Selection with Proxy Training Dynamics

  644. Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation

  645. UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation

  646. Forewarned is Forearmed: Harnessing LLMs for Data Synthesis via Failure-induced Exploration

  647. Effective and Efficient Time-Varying Counterfactual Prediction with State-Space Models

  648. On the expressiveness and spectral bias of KANs

  649. Federated Class-Incremental Learning: A Hybrid Approach Using Latent Exemplars and Data-Free Techniques to Address Local and Global Forgetting

  650. AIMS.au: A Dataset for the Analysis of Modern Slavery Countermeasures in Corporate Statements

  651. ThermalGaussian: Thermal 3D Gaussian Splatting

  652. Scaling Wearable Foundation Models

  653. Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models

  654. Language-Image Models with 3D Understanding

  655. NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models

  656. Generative Monoculture in Large Language Models

  657. Accelerating Diffusion Transformers with Token-wise Feature Caching

  658. Point-SAM: Promptable 3D Segmentation Model for Point Clouds

  659. Towards Understanding the Universality of Transformers for Next-Token Prediction

  660. Disentangling Representations through Multi-task Learning

  661. APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding

  662. Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets

  663. Causally Motivated Sycophancy Mitigation for Large Language Models

  664. Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron

  665. Pacmann: Efficient Private Approximate Nearest Neighbor Search

  666. Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study

  667. ReCogLab: a framework testing relational reasoning & cognitive hypotheses on LLMs

  668. MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models

  669. HERO: Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning

  670. On the Price of Differential Privacy for Hierarchical Clustering

  671. Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers

  672. A Unified Framework for Forward and Inverse Problems in Subsurface Imaging using Latent Space Translations

  673. Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding

  674. Image and Video Tokenization with Binary Spherical Quantization

  675. Mitigating Object Hallucination in MLLMs via Data-augmented Phrase-level Alignment

  676. Simple, Good, Fast: Self-Supervised World Models Free of Baggage

  677. MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation

  678. UniCO: On Unified Combinatorial Optimization via Problem Reduction to Matrix-Encoded General TSP

  679. Grammar Reinforcement Learning: path and cycle counting in graphs with a Context-Free Grammar and Transformer approach

  680. GPromptShield: Elevating Resilience in Graph Prompt Tuning Against Adversarial Attacks

  681. WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling

  682. RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation

  683. Quality over Quantity in Attention Layers: When Adding More Heads Hurts

  684. Language Agents Meet Causality -- Bridging LLMs and Causal World Models

  685. Sort-free Gaussian Splatting via Weighted Sum Rendering

  686. Agent-to-Sim: Learning Interactive Behavior Models from Casual Longitudinal Videos

  687. PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations

  688. Boosting Latent Diffusion with Perceptual Objectives

  689. HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

  690. PALMBENCH: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms

  691. Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

  692. InstantSplamp: Fast and Generalizable Stenography Framework for Generative Gaussian Splatting

  693. PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions

  694. EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models

  695. SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models

  696. Feature-Based Online Bilateral Trade

  697. MaxCutPool: differentiable feature-aware Maxcut for pooling in graph neural networks

  698. Correlating instruction-tuning (in multimodal models) with vision-language processing (in the brain)

  699. SafeWatch: An Efficient Safety-Policy Following Video Guardrail Model with Transparent Explanations

  700. Revisiting In-context Learning Inference Circuit in Large Language Models

  701. Optimistic Games for Combinatorial Bayesian Optimization with Application to Protein Design

  702. Vector-ICL: In-context Learning with Continuous Vector Representations

  703. A Generic Framework for Conformal Fairness

  704. Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models

  705. MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

  706. Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting

  707. Deep Signature: Characterization of Large-Scale Molecular Dynamics

  708. Diffusion Models are Evolutionary Algorithms

  709. Structural-Entropy-Based Sample Selection for Efficient and Effective Learning

  710. Consistency Models Made Easy

  711. Context Steering: Controllable Personalization at Inference Time

  712. Reflective Gaussian Splatting

  713. Optimal Transport for Time Series Imputation

  714. UniRestore3D: A Scalable Framework For General Shape Restoration

  715. Endless Jailbreaks with Bijection Learning

  716. Learning General-purpose Biomedical Volume Representations using Randomized Synthesis

  717. Discrete Distribution Networks

  718. EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage

  719. SeCom: On Memory Construction and Retrieval for Personalized Conversational Agents

  720. Language Models Learn to Mislead Humans via RLHF

  721. Enhancing Uncertainty Estimation and Interpretability with Bayesian Non-negative Decision Layer

  722. MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra

  723. Building, Reusing, and Generalizing Abstract Representations from Concrete Sequences

  724. (Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning

  725. A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals

  726. Isometric Regularization for Manifolds of Functional Data

  727. Beware of Calibration Data for Pruning Large Language Models

  728. SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP

  729. Active Learning for Neural PDE Solvers

  730. Which Tasks Should Be Compressed Together? A Causal Discovery Approach for Efficient Multi-Task Representation Compression

  731. SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models

  732. MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization

  733. Advancing Graph Generation through Beta Diffusion

  734. Scalable Universal T-Cell Receptor Embeddings from Adaptive Immune Repertoires

  735. ZeroDiff: Solidified Visual-semantic Correlation in Zero-Shot Learning

  736. Self-Supervised Diffusion MRI Denoising via Iterative and Stable Refinement

  737. BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks

  738. Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise

  739. Failures to Find Transferable Image Jailbreaks Between Vision-Language Models

  740. Learning Evolving Tools for Large Language Models

  741. Feature Responsiveness Scores: Model-Agnostic Explanations for Recourse

  742. A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation

  743. GaussianBlock: Building Part-Aware Compositional and Editable 3D Scene by Primitives and Gaussians

  744. Improving Instruction-Following in Language Models through Activation Steering

  745. Scaling Autonomous Agents via Automatic Reward Modeling And Planning

  746. Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation

  747. PFDiff: Training-Free Acceleration of Diffusion Models Combining Past and Future Scores

  748. Learning to Communicate Through Implicit Communication Channels

  749. RESfM: Robust Deep Equivariant Structure from Motion

  750. Video In-context Learning: Autoregressive Transformers are Zero-Shot Video Imitators

  751. ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains

  752. Aligning Visual Contrastive learning models via Preference Optimization

  753. MGDA Converges under Generalized Smoothness, Provably

  754. Making Text Embedders Few-Shot Learners

  755. A Solvable Attention for Neural Scaling Laws

  756. ST-GCond: Self-supervised and Transferable Graph Dataset Condensation

  757. Learning Successor Features with Distributed Hebbian Temporal Memory

  758. Inspection and Control of Self-Generated-Text Recognition Ability in Llama3-8b-Instruct

  759. When GNNs meet symmetry in ILPs: an orbit-based feature augmentation approach

  760. SINGER: Stochastic Network Graph Evolving Operator for High Dimensional PDEs

  761. FlashMask: Efficient and Rich Mask Extension of FlashAttention

  762. Towards Effective Evaluations and Comparisons for LLM Unlearning Methods

  763. On Calibration of LLM-based Guard Models for Reliable Content Moderation

  764. TimeKAN: KAN-based Frequency Decomposition Learning Architecture for Long-term Time Series Forecasting

  765. SBSC: Step-by-Step Coding for Improving Mathematical Olympiad Performance

  766. NRGBoost: Energy-Based Generative Boosted Trees

  767. Process Reward Model with Q-value Rankings

  768. LLM-based Typed Hyperresolution for Commonsense Reasoning with Knowledge Bases

  769. Credit-based self organizing maps: training deep topographic networks with minimal performance degradation

  770. Improved Algorithms for Kernel Matrix-Vector Multiplication Under Sparsity Assumptions

  771. LancBiO: Dynamic Lanczos-aided Bilevel Optimization via Krylov Subspace

  772. Needle Threading: Can LLMs Follow Threads Through Near-Million-Scale Haystacks?

  773. Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models

  774. Learning Fine-Grained Representations through Textual Token Disentanglement in Composed Video Retrieval

  775. SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation

  776. Tree of Attributes Prompt Learning for Vision-Language Models

  777. Expected Return Symmetries

  778. Intrinsic User-Centric Interpretability through Global Mixture of Experts

  779. LongVILA: Scaling Long-Context Visual Language Models for Long Videos

  780. Is Large-scale Pretraining the Secret to Good Domain Generalization?

  781. Modeling dynamic social vision highlights gaps between deep learning and humans

  782. Efficient Low-Bit Quantization with Adaptive Scales for Multi-Task Co-Training

  783. Hierarchical Uncertainty Estimation for Learning-based Registration in Neuroimaging

  784. Counterfactual Concept Bottleneck Models

  785. PIED: Physics-Informed Experimental Design for Inverse Problems

  786. In-Context Editing: Learning Knowledge from Self-Induced Distributions

  787. To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

  788. Field-DiT: Diffusion Transformer on Unified Video, 3D, and Game Field Generation

  789. Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval-Augmented Generation

  790. Growth Inhibitors for Suppressing Inappropriate Image Concepts in Diffusion Models

  791. Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control

  792. DistillHGNN: A Knowledge Distillation Approach for High-Speed Hypergraph Neural Networks

  793. DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

  794. Image-level Memorization Detection via Inversion-based Inference Perturbation

  795. Youku Dense Caption: A Large-scale Chinese Video Dense Caption Dataset and Benchmarks

  796. Benchmarking Agentic Workflow Generation

  797. MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation

  798. Routing Experts: Learning to Route Dynamic Experts in Existing Multi-modal Large Language Models

  799. Difference-of-submodular Bregman Divergence

  800. Param$\Delta$ for Direct Mixing: Post-Train Large Language Model At Zero Cost

  801. On the Modeling Capabilities of Large Language Models for Sequential Decision Making

  802. Training-free LLM-generated Text Detection by Mining Token Probability Sequences

  803. Revolutionizing EMCCD Denoising through a Novel Physics-Based Learning Framework for Noise Modeling

  804. $F^3Set$: Towards Analyzing Fast, Frequent, and Fine-grained Events from Videos

  805. IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning

  806. Text4Seg: Reimagining Image Segmentation as Text Generation

  807. Linear Multistep Solver Distillation for Fast Sampling of Diffusion Models

  808. Scalable Extraction of Training Data from Aligned, Production Language Models

  809. DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking head Video Generation

  810. SWEb: A Large Web Dataset for the Scandinavian Languages

  811. High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity

  812. MindSimulator: Exploring Brain Concept Localization via Synthetic fMRI

  813. Neural Approximate Mirror Maps for Constrained Diffusion Models

  814. Towards Empowerment Gain through Causal Structure Learning in Model-Based Reinforcement Learning

  815. Approximating Full Conformal Prediction for Neural Network Regression with Gauss-Newton Influence

  816. VoxDialogue: Can Spoken Dialogue Systems Understand Information Beyond Words?

  817. Learning to Explore and Exploit with GNNs for Unsupervised Combinatorial Optimization

  818. Differentiable Optimization of Similarity Scores Between Models and Brains

  819. Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity

  820. The Pitfalls of Memorization: When Memorization Hurts Generalization

  821. RecFlow: An Industrial Full Flow Recommendation Dataset

  822. DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation

  823. ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination

  824. Scaling Laws for Downstream Task Performance in Machine Translation

  825. Stochastic Bandits Robust to Adversarial Attacks

  826. Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries

  827. KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models

  828. Long-tailed Adversarial Training with Self-Distillation

  829. Latent Radiance Fields with 3D-aware 2D Representations

  830. Learning View-invariant World Models for Visual Robotic Manipulation

  831. Memory Efficient Transformer Adapter for Dense Predictions

  832. Logic-Logit: A Logic-Based Approach to Choice Modeling

  833. The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs

  834. Neuron Platonic Intrinsic Representation From Dynamics Using Contrastive Learning

  835. Beyond Canonicalization: How Tensorial Messages Improve Equivariant Message Passing

  836. OccProphet: Pushing the Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with an Observer-Forecaster-Refiner Framework

  837. PINP: Physics-Informed Neural Predictor with latent estimation of fluid flows

  838. Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning

  839. MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge

  840. Multimodal Quantitative Language for Generative Recommendation

  841. Does SGD really happen in tiny subspaces?

  842. Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning

  843. PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling

  844. Decoupled Subgraph Federated Learning

  845. AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction

  846. Schur's Positive-Definite Network: Deep Learning in the SPD cone with structure

  847. Selective Attention Improves Transformer

  848. VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking

  849. Reconciling Model Multiplicity for Downstream Decision Making

  850. Unbounded: A Generative Infinite Game of Character Life Simulation

  851. Flow Matching with Gaussian Process Priors for Probabilistic Time Series Forecasting

  852. FlowDec: A flow-based full-band general audio codec with high perceptual quality

  853. IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations

  854. Towards Certification of Uncertainty Calibration under Adversarial Attacks

  855. VAE-Var: Variational Autoencoder-Enhanced Variational Methods for Data Assimilation in Meteorology

  856. TODO: Enhancing LLM Alignment with Ternary Preferences

  857. Deep Kernel Posterior Learning under Infinite Variance Prior Weights

  858. Latent-EnSF: A Latent Ensemble Score Filter for High-Dimensional Data Assimilation with Sparse Observation Data

  859. Trajectory-Class-Aware Multi-Agent Reinforcement Learning

  860. EG4D: Explicit Generation of 4D Object without Score Distillation

  861. The impact of allocation strategies in subset learning on the expressive power of neural networks

  862. Learning on One Mode: Addressing Multi-modality in Offline Reinforcement Learning

  863. Weak-to-Strong Generalization Through the Data-Centric Lens

  864. VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks

  865. OmniKV: Dynamic Context Selection for Efficient Long-Context LLMs

  866. Functional Homotopy: Smoothing Discrete Optimization via Continuous Parameters for LLM Jailbreak Attacks

  867. ScImage: How good are multimodal large language models at scientific text-to-image generation?

  868. Discriminating image representations with principal distortions

  869. Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning

  870. The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws

  871. Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization

  872. Data Distillation for extrapolative protein design through exact preference optimization

  873. Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models

  874. Zero-shot Model-based Reinforcement Learning using Large Language Models

  875. Beyond Autoregression: Fast LLMs via Self-Distillation Through Time

  876. LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation

  877. Learning Splitting Heuristics in Divide-and-Conquer SAT Solvers with Reinforcement Learning

  878. ReNovo: Retrieval-Based *De Novo* Mass Spectrometry Peptide Sequencing

  879. SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation

  880. CryoGEN: Generative Energy-based Models for Cryogenic Electron Tomography Reconstruction

  881. Lift Your Molecules: Molecular Graph Generation in Latent Euclidean Space

  882. Score-based Self-supervised MRI Denoising

  883. NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens

  884. Efficiently Parameterized Neural Metriplectic Systems

  885. UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models

  886. SeRA: Self-Reviewing and Alignment of LLMs using Implicit Reward Margins

  887. Dimension Agnostic Neural Processes

  888. Benchmarking LLMs' Judgments with No Gold Standard

  889. Learning Partial Graph Matching via Optimal Partial Transport

  890. Towards Neural Scaling Laws for Time Series Foundation Models

  891. Compositional simulation-based inference for time series

  892. Jailbreaking as a Reward Misspecification Problem

  893. Equivariant Neural Functional Networks for Transformers

  894. Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters

  895. Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View

  896. Deriving Causal Order from Single-Variable Interventions: Guarantees & Algorithm

  897. SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators

  898. Context-aware Dynamic Pruning for Speech Foundation Models

  899. On the Performance Analysis of Momentum Method: A Frequency Domain Perspective

  900. PEARL: Towards Permutation-Resilient LLMs

  901. RAPID: Retrieval Augmented Training of Differentially Private Diffusion Models

  902. Near-Exact Privacy Amplification for Matrix Mechanisms

  903. A Coefficient Makes SVRG Effective

  904. Cross-Embodiment Dexterous Grasping with Reinforcement Learning

  905. On Generalization Across Environments In Multi-Objective Reinforcement Learning

  906. Composable Interventions for Language Models

  907. Bounds on $L_p$ Errors in Density Ratio Estimation via $f$-Divergence Loss Functions

  908. Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology

  909. Accelerating Neural ODEs: A Variational Formulation-based Approach

  910. RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization

  911. Physics-Informed Diffusion Models

  912. ProAdvPrompter: A Two-Stage Journey to Effective Adversarial Prompting for LLMs

  913. Improving Large Language Model Planning with Action Sequence Similarity

  914. Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation

  915. Heavy-Tailed Diffusion Models

  916. Rethinking Multiple-Instance Learning From Feature Space to Probability Space

  917. KGARevion: An AI Agent for Knowledge-Intensive Biomedical QA

  918. DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search

  919. Learning the Complexity of Weakly Noisy Quantum States

  920. RazorAttention: Efficient KV Cache Compression Through Retrieval Heads

  921. An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels

  922. BAMDP Shaping: a Unified Framework for Intrinsic Motivation and Reward Shaping

  923. Rationalizing and Augmenting Dynamic Graph Neural Networks

  924. Training-Free Diffusion Model Alignment with Sampling Demons

  925. Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation

  926. Scaling Long Context Training Data by Long-Distance Referrals

  927. Gradient-Free Generation for Hard-Constrained Systems

  928. JPEG Inspired Deep Learning

  929. A Theory for Token-Level Harmonization in Retrieval-Augmented Generation

  930. Dynamic Diffusion Transformer

  931. Backdooring Vision-Language Models with Out-Of-Distribution Data

  932. Fantastic Targets for Concept Erasure in Diffusion Models and Where To Find Them

  933. MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models

  934. COFlowNet: Conservative Constraints on Flows Enable High-Quality Candidate Generation

  935. Precedence-Constrained Winter Value for Effective Graph Data Valuation

  936. Hierarchical Autoregressive Transformers: Combining Byte- and Word-Level Processing for Robust, Adaptable Language Models

  937. AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark

  938. Self-Correcting Decoding with Generative Feedback for Mitigating Hallucinations in Large Vision-Language Models

  939. MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos

  940. Discrete Diffusion Schrödinger Bridge Matching for Graph Transformation

  941. PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation

  942. Asymptotic Analysis of Two-Layer Neural Networks after One Gradient Step under Gaussian Mixtures Data with Structure

  943. GLOMA: Global Video Text Spotting with Morphological Association

  944. SplineGS: Learning Smooth Trajectories in Gaussian Splatting for Dynamic Scene Reconstruction

  945. Diffusion Feedback Helps CLIP See Better

  946. SV4D: Dynamic 3D Content Generation with Multi-Frame and Multi-View Consistency

  947. MiniPLM: Knowledge Distillation for Pre-training Language Models

  948. Holographic Node Representations: Pre-training Task-Agnostic Node Embeddings

  949. ElasticTok: Adaptive Tokenization for Image and Video

  950. ThinkBot: Embodied Instruction Following with Thought Chain Reasoning

  951. Studying the Interplay Between the Actor and Critic Representations in Reinforcement Learning

  952. Quantum (Inspired) $D^2$-sampling with Applications

  953. Adversarial Generative Flow Network for Solving Vehicle Routing Problems

  954. A Simple Approach to Unifying Diffusion-based Conditional Generation

  955. Node Identifiers: Compact, Discrete Representations for Efficient Graph Learning

  956. Lightning-Fast Image Inversion and Editing for Text-to-Image Diffusion Models

  957. Automated Design of Agentic Systems

  958. Adversarially Robust Anomaly Detection through Spurious Negative Pair Mitigation

  959. Sparse Learning for State Space Models on Mobile

  960. Breaking Mental Set to Improve Reasoning through Diverse Multi-Agent Debate

  961. On the Benefits of Attribute-Driven Graph Domain Adaptation

  962. Enhance Multi-View Classification Through Multi-Scale Alignment and Expanded Boundary

  963. Context-Alignment: Activating and Enhancing LLMs Capabilities in Time Series

  964. U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models

  965. SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix

  966. TabDiff: a Mixed-type Diffusion Model for Tabular Data Generation

  967. Adversarial Machine Unlearning

  968. Adding Conditional Control to Diffusion Models with Reinforcement Learning

  969. How efficient is LLM-generated code? A rigorous & high-standard benchmark

  970. Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction

  971. Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback

  972. Reward Dimension Reduction for Scalable Multi-Objective Reinforcement Learning

  973. Provable Robust Overfitting Mitigation in Wasserstein Distributionally Robust Optimization

  974. Flow-based Variational Mutual Information: Fast and Flexible Approximations

  975. Towards Learning High-Precision Least Squares Algorithms with Sequence Models

  976. MetaMetrics: Calibrating Metrics for Generation Tasks Using Human Preferences

  977. Neural Spacetimes for DAG Representation Learning

  978. Instructional Segment Embedding: Improving LLM Safety with Instruction Hierarchy

  979. BTBS-LNS: Binarized-Tightening, Branch and Search on Learning LNS Policies for MIP

  980. Semantix: An Energy-guided Sampler for Semantic Style Transfer

  981. Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective

  982. Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix

  983. JetFormer: An autoregressive generative model of raw images and text

  984. FreeCG: Free the Design Space of Clebsch-Gordan Transform for Machine Learning Force Fields

  985. PiCO: Peer Review in LLMs based on Consistency Optimization

  986. Density estimation with LLMs: a geometric investigation of in-context learning trajectories

  987. nGPT: Normalized Transformer with Representation Learning on the Hypersphere

  988. Collaborative Discrete-Continuous Black-Box Prompt Learning for Language Models

  989. pMoE: Prompting Diverse Experts Together Wins More in Visual Adaptation

  990. CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation

  991. C-CLIP: Multimodal Continual Learning for Vision-Language Model

  992. Offline Model-Based Optimization by Learning to Rank

  993. Autocorrelation Matters: Understanding the Role of Initialization Schemes for State Space Models

  994. Aioli: A Unified Optimization Framework for Language Model Data Mixing

  995. Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models

  996. Implicit Neural Surface Deformation with Explicit Velocity Fields

  997. Efficient Masked AutoEncoder for Video Object Counting and A Large-Scale Benchmark

  998. Data-adaptive Differentially Private Prompt Synthesis for In-Context Learning

  999. 6DGS: Enhanced Direction-Aware Gaussian Splatting for Volumetric Rendering

  1000. One Model Transfer to All: On Robust Jailbreak Prompts Generation against LLMs

  1001. Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance

  1002. MAST: model-agnostic sparsified training

  1003. Group Downsampling with Equivariant Anti-aliasing

  1004. Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data

  1005. Flow: Modularized Agentic Workflow Automation

  1006. Energy-Based Diffusion Language Models for Text Generation

  1007. Understanding Optimization in Deep Learning with Central Flows

  1008. Temporal Reasoning Transfer from Text to Video

  1009. Your Weak LLM is Secretly a Strong Teacher for Alignment

  1010. ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation

  1011. Adaptive Energy Alignment for Accelerating Test-Time Adaptation

  1012. Partial Gromov-Wasserstein Metric

  1013. Personalized Visual Instruction Tuning

  1014. Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model

  1015. DRoC: Elevating Large Language Models for Complex Vehicle Routing via Decomposed Retrieval of Constraints

  1016. Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

  1017. Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models

  1018. Robust LLM safeguarding via refusal feature adversarial training

  1019. MMed-RAG: Versatile Multimodal RAG System for Medical Vision Language Models

  1020. What Are Good Positional Encodings for Directed Graphs?

  1021. Deep Incomplete Multi-view Learning via Cyclic Permutation of VAEs

  1022. Language Models Need Inductive Biases to Count Inductively

  1023. Articulate-Anything: Automatic Modeling of Articulated Objects via a Vision-Language Foundation Model

  1024. Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models

  1025. Swing-by Dynamics in Concept Learning and Compositional Generalization

  1026. An Evolved Universal Transformer Memory

  1027. On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning

  1028. VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text

  1029. TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation

  1030. Minimal Impact ControlNet: Advancing Multi-ControlNet Integration

  1031. Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints

  1032. Training-Free Dataset Pruning for Instance Segmentation

  1033. From Probability to Counterfactuals: the Increasing Complexity of Satisfiability in Pearl's Causal Hierarchy

  1034. Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation

  1035. Robust Transfer of Safety-Constrained Reinforcement Learning Agents

  1036. Online-to-Offline RL for Agent Alignment

  1037. Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models

  1038. Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling

  1039. ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments

  1040. $\tau$-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

  1041. Bayesian Image Regression with Soft-thresholded Conditional Autoregressive Prior

  1042. Beyond Mere Token Analysis: A Hypergraph Metric Space Framework for Defending Against Socially Engineered LLM Attacks

  1043. OCEAN: Offline Chain-of-thought Evaluation and Alignment in Large Language Models

  1044. Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation

  1045. Learning system dynamics without forgetting

  1046. UIFace: Unleashing Inherent Model Capabilities to Enhance Intra-Class Diversity in Synthetic Face Recognition

  1047. SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?

  1048. Optimal Protocols for Continual Learning via Statistical Physics and Control Theory

  1049. DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image

  1050. High-quality Text-to-3D Character Generation with SparseCubes and Sparse Transformers.

  1051. Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach

  1052. Warm Diffusion: Recipe for Blur-Noise Mixture Diffusion Models

  1053. CircuitFusion: Multimodal Circuit Representation Learning for Agile Chip Design

  1054. Lossy Compression with Pretrained Diffusion Models

  1055. MoLEx: Mixture of Layer Experts for Fine-tuning with Sparse Upcycling

  1056. PaLD: Detection of Text Partially Written by Large Language Models

  1057. Graph Transformers Dream of Electric Flow

  1058. $InterLCM$: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration

  1059. Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology

  1060. Mechanism and Emergence of Stacked Attention Heads in Multi-Layer Transformers

  1061. PhyloLM: Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks

  1062. Reasoning of Large Language Models over Knowledge Graphs with Super-Relations

  1063. Biologically Plausible Brain Graph Transformer

  1064. Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning

  1065. A Causal Lens for Learning Long-term Fair Policies

  1066. Smoothing the Shift: Towards Stable Test-Time Adaptation under Complex Multimodal Noises

  1067. Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception

  1068. N-ForGOT: Towards Not-forgetting and Generalization of Open Temporal Graph Learning

  1069. Systematic Outliers in Large Language Models

  1070. Protecting against simultaneous data poisoning attacks

  1071. SEMDICE: Off-policy State Entropy Maximization via Stationary Distribution Correction Estimation

  1072. Generating Likely Counterfactuals Using Sum-Product Networks

  1073. Metric-Driven Attributions for Vision Transformers

  1074. Greener GRASS: Enhancing GNNs with Encoding, Rewiring, and Attention

  1075. TS-LIF: A Temporal Segment Spiking Neuron Network for Time Series Forecasting

  1076. Exposure Bracketing Is All You Need For A High-Quality Image

  1077. Semi-Supervised Vision-Centric 3D Occupancy World Model for Autonomous Driving

  1078. Enhanced Diffusion Sampling via Extrapolation with Multiple ODE Solutions

  1079. A Statistical Framework for Ranking LLM-based Chatbots

  1080. OSTQuant: Refining Large Language Model Quantization with Orthogonal and Scaling Transformations for Better Distribution Fitting

  1081. TULIP: Token-length Upgraded CLIP

  1082. Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study

  1083. Gated Delta Networks: Improving Mamba2 with Delta Rule

  1084. Global Well-posedness and Convergence Analysis of Score-based Generative Models via Sharp Lipschitz Estimates

  1085. Neuralized Markov Random Field for Interaction-Aware Stochastic Human Trajectory Prediction

  1086. Monte Carlo Planning with Large Language Model for Text-Based Game Agents

  1087. What's the Move? Hybrid Imitation Learning via Salient Points

  1088. Breaking the Reclustering Barrier in Centroid-based Deep Clustering

  1089. Spiking Vision Transformer with Saccadic Attention

  1090. Balancing Bias in Two-sided Markets for Fair Stable Matchings

  1091. Locality Alignment Improves Vision-Language Models

  1092. ILLUSION: Unveiling Truth with a Comprehensive Multi-Modal, Multi-Lingual Deepfake Dataset

  1093. Do LLM Agents Have Regret? A Case Study in Online Learning and Games

  1094. Bridging the Semantic Gap Between Text and Table: A Case Study on NL2SQL

  1095. PIORF: Physics-Informed Ollivier-Ricci Flow for Long-Range Interactions in Mesh Graph Neural Networks

  1096. ACTIVE: Offline Reinforcement Learning via Adaptive Imitation and In-sample $V$-Ensemble

  1097. Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models Trained on Corrupted Data

  1098. Causal Effect Estimation with Mixed Latent Confounders and Post-treatment Variables

  1099. Generalized Video Moment Retrieval

  1100. How Much is a Noisy Image Worth? Data Scaling Laws for Ambient Diffusion.

  1101. Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling

  1102. Decoupled Finetuning for Domain Generalizable Semantic Segmentation

  1103. LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization

  1104. econSG: Efficient and Multi-view Consistent Open-Vocabulary 3D Semantic Gaussians

  1105. GravMAD: Grounded Spatial Value Maps Guided Action Diffusion for Generalized 3D Manipulation

  1106. Laplace Sample Information: Data Informativeness Through a Bayesian Lens

  1107. Systematic Relational Reasoning With Epistemic Graph Neural Networks

  1108. Improving Generalization and Robustness in SNNs Through Signed Rate Encoding and Sparse Encoding Attacks

  1109. CoMotion: Concurrent Multi-person 3D Motion

  1110. Generalization, Expressivity, and Universality of Graph Neural Networks on Attributed Graphs

  1111. MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

  1112. Do LLMs "know" internally when they follow instructions?

  1113. Multi-Perspective Data Augmentation for Few-shot Object Detection

  1114. Homomorphism Counts as Structural Encodings for Graph Learning

  1115. A new framework for evaluating model out-of-distribution generalisation for the biochemical domain

  1116. SFESS: Score Function Estimators for $k$-Subset Sampling

  1117. How many samples are needed to train a deep neural network?

  1118. Generalization Bounds and Model Complexity for Kolmogorov–Arnold Networks

  1119. HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

  1120. VOILA: Evaluation of MLLMs For Perceptual Understanding and Analogical Reasoning

  1121. $\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models

  1122. Forgetting Transformer: Softmax Attention with a Forget Gate

  1123. CofCA: A STEP-WISE Counterfactual Multi-hop QA benchmark

  1124. Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry

  1125. Rethinking Invariance in In-context Learning

  1126. Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing

  1127. On the Optimization Landscape of Low Rank Adaptation Methods for Large Language Models

  1128. G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model

  1129. AdaFisher: Adaptive Second Order Optimization via Fisher Information

  1130. Learning from Imperfect Human Feedback: A Tale from Corruption-Robust Dueling

  1131. mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models

  1132. UV-Attack: Physical-World Adversarial Attacks on Person Detection via Dynamic-NeRF-based UV Mapping

  1133. Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis

  1134. Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference

  1135. ESE: Espresso Sentence Embeddings

  1136. Tackling Data Corruption in Offline Reinforcement Learning via Sequence Modeling

  1137. Adaptive Shrinkage Estimation for Personalized Deep Kernel Regression in Modeling Brain Trajectories

  1138. Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics

  1139. Random-Set Neural Networks

  1140. SOREL: A Stochastic Algorithm for Spectral Risks Minimization

  1141. Encryption-Friendly LLM Architecture

  1142. TEOChat: A Large Vision-Language Assistant for Temporal Earth Observation Data

  1143. LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

  1144. ICLR: In-Context Learning of Representations

  1145. Explanations of GNN on Evolving Graphs via Axiomatic Layer edges

  1146. Streamlining Prediction in Bayesian Deep Learning

  1147. Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models

  1148. TEASER: Token Enhanced Spatial Modeling for Expressions Reconstruction

  1149. Leveraging Flatness to Improve Information-Theoretic Generalization Bounds for SGD

  1150. Boltzmann priors for Implicit Transfer Operators

  1151. Diverse Preference Learning for Capabilities and Alignment

  1152. CONTRA: Conformal Prediction Region via Normalizing Flow Transformation

  1153. Bayesian WeakS-to-Strong from Text Classification to Generation

  1154. Multimodal Lego: Model Merging and Fine-Tuning Across Topologies and Modalities in Biomedicine

  1155. Distribution-Free Data Uncertainty for Neural Network Regression

  1156. Jump Your Steps: Optimizing Sampling Schedule of Discrete Diffusion Models

  1157. A Generalist Hanabi Agent

  1158. Rethinking Fair Representation Learning for Performance-Sensitive Tasks

  1159. Generative Flows on Synthetic Pathway for Drug Design

  1160. Revisit Micro-batch Clipping: Adaptive Data Pruning via Gradient Manipulation

  1161. FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models

  1162. XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

  1163. Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only

  1164. Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling

  1165. NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation

  1166. Content-Style Learning from Unaligned Domains: Identifiability under Unknown Latent Dimensions

  1167. A Benchmark for Semantic Sensitive Information in LLMs Outputs

  1168. Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models

  1169. Spreading Out-of-Distribution Detection on Graphs

  1170. Stealthy Shield Defense: A Conditional Mutual Information-Based Approach against Black-Box Model Inversion Attacks

  1171. Solving New Tasks by Adapting Internet Video Knowledge

  1172. Learning Causal Alignment for Reliable Disease Diagnosis

  1173. Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning

  1174. GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning

  1175. Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models

  1176. Stable Segment Anything Model

  1177. Equivariant Denoisers Cannot Copy Graphs: Align Your Graph Diffusion Models

  1178. Achieving Dimension-Free Communication in Federated Learning via Zeroth-Order Optimization

  1179. Large (Vision) Language Models are Unsupervised In-Context Learners

  1180. ColPali: Efficient Document Retrieval with Vision Language Models

  1181. Conflict-Averse Gradient Aggregation for Constrained Multi-Objective Reinforcement Learning

  1182. SpinQuant: LLM Quantization with Learned Rotations

  1183. Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding

  1184. On the Byzantine-Resilience of Distillation-Based Federated Learning

  1185. OMG: Opacity Matters in Material Modeling with Gaussian Splatting

  1186. Counterfactual Generative Modeling with Variational Causal Inference

  1187. Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning

  1188. Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models

  1189. SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes

  1190. Minimax Optimal Two-Stage Algorithm For Moment Estimation Under Covariate Shift

  1191. Support is All You Need for Certified VAE Training

  1192. Do Mice Grok? Glimpses of Hidden Progress in Sensory Cortex

  1193. Learning Graph Quantized Tokenizers

  1194. Dynamic Low-Rank Sparse Adaptation for Large Language Models

  1195. AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents

  1196. WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

  1197. Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG

  1198. Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs)

  1199. Track-On: Transformer-based Online Point Tracking with Memory

  1200. DreamDistribution: Learning Prompt Distribution for Diverse In-distribution Generation

  1201. Kernel-based Optimally Weighted Conformal Time-Series Prediction

  1202. Knowledge Graph Finetuning Enhances Knowledge Manipulation in Large Language Models

  1203. Adversarial Attacks on Data Attribution

  1204. Variance-Reducing Couplings for Random Features

  1205. FaceShot: Bring Any Character into Life

  1206. Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation

  1207. TIS-DPO: Token-level Importance Sampling for Direct Preference Optimization With Estimated Weights

  1208. Physics of Language Models: Part 3.2, Knowledge Manipulation

  1209. Analytic DAG Constraints for Differentiable DAG Learning

  1210. Generative Classifiers Avoid Shortcut Solutions

  1211. SimpleTM: A Simple Baseline for Multivariate Time Series Forecasting

  1212. Looking into User’s Long-term Interests through the Lens of Conservative Evidential Learning

  1213. Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

  1214. Strategic Classification With Externalities

  1215. EC-Diffuser: Multi-Object Manipulation via Entity-Centric Behavior Generation

  1216. ANaGRAM: A Natural Gradient Relative to Adapted Model for efficient PINNs learning

  1217. An Effective Manifold-based Optimization Method for Distributionally Robust Classification

  1218. ContextGNN: Beyond Two-Tower Recommendation Systems

  1219. Revisiting Source-Free Domain Adaptation: a New Perspective via Uncertainty Control

  1220. Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance

  1221. Neuron-based Multifractal Analysis of Neuron Interaction Dynamics in Large Models

  1222. DynFrs: An Efficient Framework for Machine Unlearning in Random Forest

  1223. Chunk-Distilled Language Modeling

  1224. Constraint-Conditioned Actor-Critic for Offline Safe Reinforcement Learning

  1225. Chain-of-Focus Prompting: Leveraging Sequential Visual Cues to Prompt Large Autoregressive Vision Models

  1226. Multi-Scale Fusion for Object Representation

  1227. MeToken: Uniform Micro-environment Token Boosts Post-Translational Modification Prediction

  1228. Neural Causal Graph for Interpretable and Intervenable Classification

  1229. NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields

  1230. Shifting the Paradigm: A Diffeomorphism Between Time Series Data Manifolds for Achieving Shift-Invariancy in Deep Learning

  1231. KinFormer: Generalizable Dynamical Symbolic Regression for Catalytic Organic Reaction Kinetics

  1232. The Unreasonable Ineffectiveness of the Deeper Layers

  1233. ACC-Collab: An Actor-Critic Approach to Multi-Agent LLM Collaboration

  1234. Visual Agents as Fast and Slow Thinkers

  1235. Tree-Wasserstein Distance for High Dimensional Data with a Latent Feature Hierarchy

  1236. InversionGNN: A Dual Path Network for Multi-Property Molecular Optimization

  1237. Graph Neural Networks Are More Than Filters: Revisiting and Benchmarking from A Spectral Perspective

  1238. Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers

  1239. CAMEx: Curvature-aware Merging of Experts

  1240. Online Epsilon Net & Piercing Set for Geometric Concepts

  1241. ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler

  1242. Misspecified $Q$-Learning with Sparse Linear Function Approximation: Tight Bounds on Approximation Error

  1243. Scale-Free Graph-Language Models

  1244. YouTube-SL-25: A Large-Scale, Open-Domain Multilingual Sign Language Parallel Corpus

  1245. Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow

  1246. Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist

  1247. Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning

  1248. Causal Information Prioritization for Efficient Reinforcement Learning

  1249. HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics

  1250. Distilling Dataset into Neural Field

  1251. TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning

  1252. Balanced Neural ODEs: nonlinear model order reduction and Koopman operator approximations

  1253. Relax and Merge: A Simple Yet Effective Framework for Solving Fair $k$-Means and $k$-sparse Wasserstein Barycenter Problems

  1254. cryoSPHERE: Single-Particle HEterogeneous REconstruction from cryo EM

  1255. PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation

  1256. Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis

  1257. Towards Semantic Equivalence of Tokenization in Multimodal LLM

  1258. Linear combinations of latents in generative models: subspaces and beyond

  1259. Learning mirror maps in policy mirror descent

  1260. A Tight Convergence Analysis of Inexact Stochastic Proximal Point Algorithm for Stochastic Composite Optimization Problems

  1261. Shedding Light on Time Series Classification using Interpretability Gated Networks

  1262. Generalization Bounds for Canonicalization: A Comparative Study with Group Averaging

  1263. MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation

  1264. Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint

  1265. Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity

  1266. Atomas: Hierarchical Adaptive Alignment on Molecule-Text for Unified Molecule Understanding and Generation

  1267. From Layers to States: A State Space Model Perspective to Deep Neural Network Layer Dynamics

  1268. TIGeR: Unifying Text-to-Image Generation and Retrieval with Large Multimodal Models

  1269. Towards Federated RLHF with Aggregated Client Preference for LLMs

  1270. Subtask-Aware Visual Reward Learning from Segmented Demonstrations

  1271. From Isolated Conversations to Hierarchical Schemas: Dynamic Tree Memory Representation for LLMs

  1272. Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation

  1273. Quamba: A Post-Training Quantization Recipe for Selective State Space Models

  1274. Active Learning for Continual Learning: Keeping the Past Alive in the Present

  1275. Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation

  1276. SimXRD-4M: Big Simulated X-ray Diffraction Data and Crystal Symmetry Classification Benchmark

  1277. Associative memory and dead neurons

  1278. Preble: Efficient Distributed Prompt Scheduling for LLM Serving

  1279. From Tokens to Lattices: Emergent Lattice Structures in Language Models

  1280. Adaptive Length Image Tokenization via Recurrent Allocation

  1281. HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

  1282. Topological Zigzag Spaghetti for Diffusion-based Generation and Prediction on Graphs

  1283. E(3)-equivariant models cannot learn chirality: Field-based molecular generation

  1284. A General Framework for Off-Policy Learning with Partially-Observed Reward

  1285. Pursuing Feature Separation based on Neural Collapse for Out-of-Distribution Detection

  1286. CipherPrune: Efficient and Scalable Private Transformer Inference

  1287. OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data

  1288. Last Iterate Convergence of Incremental Methods as a Model of Forgetting

  1289. Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models

  1290. AgentSquare: Automatic LLM Agent Search in Modular Design Space

  1291. GS-CPR: Efficient Camera Pose Refinement via 3D Gaussian Splatting

  1292. DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing

  1293. Fragment and Geometry Aware Tokenization of Molecules for Structure-Based Drug Design Using Language Models

  1294. ParetoFlow: Guided Flows in Multi-Objective Optimization

  1295. GOFA: A Generative One-For-All Model for Joint Graph Language Modeling

  1296. Apollo-MILP: An Alternating Prediction-Correction Neural Solving Framework for Mixed-Integer Linear Programming

  1297. Diffusion Policy Policy Optimization

  1298. Image Watermarks are Removable using Controllable Regeneration from Clean Noise

  1299. Coreset Selection via Reducible Loss in Continual Learning

  1300. Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax

  1301. Is Your Video Language Model a Reliable Judge?

  1302. SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints

  1303. AdvPaint: Protecting Images from Inpainting Manipulation via Adversarial Attention Disruption

  1304. Bridging the Gap Between f-divergences and Bayes Hilbert Spaces

  1305. Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape View

  1306. An Engorgio Prompt Makes Large Language Model Babble on

  1307. CAT-3DGS: A Context-Adaptive Triplane Approach to Rate-Distortion-Optimized 3DGS Compression

  1308. Tracking objects that change in appearance with phase synchrony

  1309. CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching

  1310. Local Steps Speed Up Local GD for Heterogeneous Distributed Logistic Regression

  1311. Predicate Hierarchies Improve Few-Shot State Classification

  1312. Robust Conformal Prediction with a Single Binary Certificate

  1313. Is uniform expressivity too restrictive? Towards efficient expressivity of GNNs

  1314. SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs

  1315. Multi-Dimensional Conformal Prediction

  1316. From Promise to Practice: Realizing High-performance Decentralized Training

  1317. Causal Concept Graph Models: Beyond Causal Opacity in Deep Learning

  1318. Point-based Instance Completion with Scene Constraints

  1319. Hidden in the Noise: Two-Stage Robust Watermarking for Images

  1320. Unifying Causal Representation Learning with the Invariance Principle

  1321. Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length

  1322. Discrete Codebook World Models for Continuous Control

  1323. ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities

  1324. Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning

  1325. CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding

  1326. Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning

  1327. Task Descriptors Help Transformers Learn Linear Models In-Context

  1328. Analysis of Linear Mode Connectivity via Permutation-Based Weight Matching: With Insights into Other Permutation Search Methods

  1329. Do as I do (Safely): Mitigating Task-Specific Fine-tuning Risks in Large Language Models

  1330. Enhancing Robust Fairness via Confusional Spectral Regularization

  1331. TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval

  1332. PT-T2I/V: An Efficient Proxy-Tokenized Diffusion Transformer for Text-to-Image/Video-Task

  1333. HMoRA: Making LLMs More Effective with Hierarchical Mixture of LoRA Experts

  1334. Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step

  1335. DCT-CryptoNets: Scaling Private Inference in the Frequency Domain

  1336. Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity

  1337. Efficient Top-m Data Values Identification for Data Selection

  1338. Differentially Private Steering for Large Language Model Alignment

  1339. Diversity-Rewarded CFG Distillation

  1340. Agent S: An Open Agentic Framework that Uses Computers Like a Human

  1341. Contractive Dynamical Imitation Policies for Efficient Out-of-Sample Recovery

  1342. Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data

  1343. Catastrophic Failure of LLM Unlearning via Quantization

  1344. RFMamba: Frequency-Aware State Space Model for RF-Based Human-Centric Perception

  1345. Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations?

  1346. BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments

  1347. Rethinking Graph Neural Networks From A Geometric Perspective Of Node Features

  1348. Gaussian Mixture Counterfactual Generator

  1349. CarbonSense: A Multimodal Dataset and Baseline for Carbon Flux Modelling

  1350. Enhancing Graph Of Thought: Enhancing Prompts with LLM Rationales and Dynamic Temperature Control

  1351. HiBug2: Efficient and Interpretable Error Slice Discovery for Comprehensive Model Debugging

  1352. OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces

  1353. Robust Root Cause Diagnosis using In-Distribution Interventions

  1354. Taming Overconfidence in LLMs: Reward Calibration in RLHF

  1355. Prompt as Knowledge Bank: Boost Vision-language model via Structural Representation for zero-shot medical detection

  1356. Ensembles of Low-Rank Expert Adapters

  1357. Semi-Parametric Retrieval via Binary Bag-of-Tokens Index

  1358. FlashRNN: I/O-Aware Optimization of Traditional RNNs on modern hardware

  1359. One for all and all for one: Efficient computation of partial Wasserstein distances on the line

  1360. Collapsed Language Models Promote Fairness

  1361. InstaSHAP: Interpretable Additive Models Explain Shapley Values Instantly

  1362. INS: Interaction-aware Synthesis to Enhance Offline Multi-agent Reinforcement Learning

  1363. Finally Rank-Breaking Conquers MNL Bandits: Optimal and Efficient Algorithms for MNL Assortment

  1364. Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives

  1365. Provably Safeguarding a Classifier from OOD and Adversarial Samples

  1366. Transformer Block Coupling and its Correlation with Generalization in LLMs

  1367. ChemAgent: Self-updating Memories in Large Language Models Improves Chemical Reasoning

  1368. Addressing Label Shift in Distributed Learning via Entropy Regularization

  1369. Circuit Transformer: A Transformer That Preserves Logical Equivalence

  1370. RMB: Comprehensively benchmarking reward models in LLM alignment

  1371. MA$^2$E: Addressing Partial Observability in Multi-Agent Reinforcement Learning with Masked Auto-Encoder

  1372. Convergent Privacy Loss of Noisy-SGD without Convexity and Smoothness

  1373. Reconsidering Faithfulness in Regular, Self-Explainable and Domain Invariant GNNs

  1374. BenTo: Benchmark Reduction with In-Context Transferability

  1375. Boosting the visual interpretability of CLIP via adversarial fine-tuning

  1376. Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats

  1377. No Equations Needed: Learning System Dynamics Without Relying on Closed-Form ODEs

  1378. Neural Wave Equation for Irregularly Sampled Sequence Data

  1379. Re-evaluating Open-ended Evaluation of Large Language Models

  1380. Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation

  1381. PaPaGei: Open Foundation Models for Optical Physiological Signals

  1382. Diffusing States and Matching Scores: A New Framework for Imitation Learning

  1383. Ward: Provable RAG Dataset Inference via LLM Watermarks

  1384. Node-Time Conditional Prompt Learning in Dynamic Graphs

  1385. Safety Layers in Aligned Large Language Models: The Key to LLM Security

  1386. StringLLM: Understanding the String Processing Capability of Large Language Models

  1387. Open-Set Graph Anomaly Detection via Normal Structure Regularisation

  1388. ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning

  1389. Beyond single neurons: population response geometry in digital twins of mouse visual cortex

  1390. Diffusion State-Guided Projected Gradient for Inverse Problems

  1391. AstroCompress: A benchmark dataset for multi-purpose compression of astronomical data

  1392. LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

  1393. CHAMP: Conformalized 3D Human Multi-Hypothesis Pose Estimators

  1394. DeeperForward: Enhanced Forward-Forward Training for Deeper and Better Performance

  1395. Air Quality Prediction with Physics-Guided Dual Neural ODEs in Open Systems

  1396. Semantic Loss Guided Data Efficient Supervised Fine Tuning for Safe Responses in LLMs

  1397. Unearthing Skill-level Insights for Understanding Trade-offs of Foundation Models

  1398. PolaFormer: Polarity-aware Linear Attention for Vision Transformers

  1399. Unveiling the Magic of Code Reasoning through Hypothesis Decomposition and Amendment

  1400. ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents

  1401. Learning Long Range Dependencies on Graphs via Random Walks

  1402. MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

  1403. Predictive Uncertainty Quantification for Bird's Eye View Segmentation: A Benchmark and Novel Loss Function

  1404. RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data

  1405. Efficient Dictionary Learning with Switch Sparse Autoencoders

  1406. Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation

  1407. DUET: Decentralized Bilevel Optimization without Lower-Level Strong Convexity

  1408. Trained Transformer Classifiers Generalize and Exhibit Benign Overfitting In-Context

  1409. Personalized Representation from Personalized Generation

  1410. CURIE: Evaluating LLMs on Multitask Scientific Long-Context Understanding and Reasoning

  1411. Neural Phylogeny: Fine-Tuning Relationship Detection among Neural Networks

  1412. Indirect Gradient Matching for Adversarial Robust Distillation

  1413. CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models

  1414. Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration

  1415. Hotspot-Driven Peptide Design via Multi-Fragment Autoregressive Extension

  1416. Learning a Neural Solver for Parametric PDEs to Enhance Physics-Informed Methods

  1417. Quantifying Generalization Complexity for Large Language Models

  1418. To Clip or not to Clip: the Dynamics of SGD with Gradient Clipping in High-Dimensions

  1419. The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling

  1420. An Undetectable Watermark for Generative Image Models

  1421. Long-Sequence Recommendation Models Need Decoupled Embeddings

  1422. OmnixR: Evaluating Omni-modality Language Models on Reasoning across Modalities

  1423. U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models

  1424. Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance

  1425. The AdEMAMix Optimizer: Better, Faster, Older

  1426. SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters

  1427. VTDexManip: A Dataset and Benchmark for Visual-tactile Pretraining and Dexterous Manipulation with Reinforcement Learning

  1428. AssembleFlow: Rigid Flow Matching with Inertial Frames for Molecular Assembly

  1429. Extending Mercer's expansion to indefinite and asymmetric kernels

  1430. Learning Interleaved Image-Text Comprehension in Vision-Language Large Models

  1431. Large Language Models can Become Strong Self-Detoxifiers

  1432. UniDrive: Towards Universal Driving Perception Across Camera Configurations

  1433. HeadMap: Locating and Enhancing Knowledge Circuits in LLMs

  1434. AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models

  1435. Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens

  1436. Positive-Unlabeled Diffusion Models for Preventing Sensitive Data Generation

  1437. On the Convergence of No-Regret Dynamics in Information Retrieval Games with Proportional Ranking Functions

  1438. ComLoRA: A Competitive Learning Approach for Enhancing LoRA

  1439. BLEND: Behavior-guided Neural Population Dynamics Modeling via Privileged Knowledge Distillation

  1440. Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting

  1441. Efficient Jailbreak Attack Sequences on Large Language Models via Multi-Armed Bandit-Based Context Switching

  1442. Competitive Fair Scheduling with Predictions

  1443. ZETA: Leveraging $Z$-order Curves for Efficient Top-$k$ Attention

  1444. Attribute-based Visual Reprogramming for Vision-Language Models

  1445. Minimalistic Predictions for Online Class Constraint Scheduling

  1446. Diffusion-based Neural Network Weights Generation

  1447. OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation

  1448. Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering

  1449. Optimizing importance weighting in the presence of sub-population shifts

  1450. Temporal Difference Learning: Why It Can Be Fast and How It Will Be Faster

  1451. Scaling Diffusion Language Models via Adaptation from Autoregressive Models

  1452. WeatherGFM: Learning a Weather Generalist Foundation Model via In-context Learning

  1453. Noisy Test-Time Adaptation in Vision-Language Models

  1454. CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking

  1455. Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting

  1456. Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks

  1457. Action abstractions for amortized sampling

  1458. Risk-Sensitive Variational Actor-Critic: A Model-Based Approach

  1459. Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs

  1460. Connecting Federated ADMM to Bayes

  1461. Proactive Privacy Amnesia for Large Language Models: Safeguarding PII with Negligible Impact on Model Utility

  1462. M^3PC: Test-time Model Predictive Control using Pretrained Masked Trajectory Model

  1463. Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data

  1464. Sparse autoencoders reveal selective remapping of visual concepts during adaptation

  1465. AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents

  1466. The Value of Sensory Information to a Robot

  1467. Representational Similarity via Interpretable Visual Concepts

  1468. SafeDiffuser: Safe Planning with Diffusion Probabilistic Models

  1469. Fast and Accurate Blind Flexible Docking

  1470. Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences

  1471. Uncertainty and Influence aware Reward Model Refinement for Reinforcement Learning from Human Feedback

  1472. The Robustness of Differentiable Causal Discovery in Misspecified Scenarios

  1473. Uncertainty modeling for fine-tuned implicit functions

  1474. Language models scale reliably with over-training and on downstream tasks

  1475. PaCA: Partial Connection Adaptation for Efficient Fine-Tuning

  1476. End-to-end Learning of Gaussian Mixture Priors for Diffusion Sampler

  1477. Mind the GAP: Glimpse-based Active Perception improves generalization and sample efficiency of visual reasoning

  1478. REMEDY: Recipe Merging Dynamics in Large Vision-Language Models

  1479. Selective Aggregation for Low-Rank Adaptation in Federated Learning

  1480. DeciMamba: Exploring the Length Extrapolation Potential of Mamba

  1481. LLaRA: Supercharging Robot Learning Data for Vision-Language Policy

  1482. Can We Talk Models Into Seeing the World Differently?

  1483. ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance

  1484. Utility-Directed Conformal Prediction: A Decision-Aware Framework for Actionable Uncertainty Quantification

  1485. Fast Summation of Radial Kernels via QMC Slicing

  1486. Stochastic variance-reduced Gaussian variational inference on the Bures-Wasserstein manifold

  1487. Why In-Context Learning Models are Good Few-Shot Learners?

  1488. 3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting

  1489. Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models

  1490. Duoduo CLIP: Efficient 3D Understanding with Multi-View Images

  1491. DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding

  1492. Provably Robust Explainable Graph Neural Networks against Graph Perturbation Attacks

  1493. Near-optimal Active Regression of Single-Index Models

  1494. The Optimization Landscape of SGD Across the Feature Learning Strength

  1495. Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion

  1496. EcoFace: Audio-Visual Emotional Co-Disentanglement Speech-Driven 3D Talking Face Generation

  1497. Adam-mini: Use Fewer Learning Rates To Gain More

  1498. Improving Graph Neural Networks by Learning Continuous Edge Directions

  1499. MuPT: A Generative Symbolic Music Pretrained Transformer

  1500. Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs

  1501. Minimax Optimal Reinforcement Learning with Quasi-Optimism

  1502. Interpreting Language Reward Models via Contrastive Explanations

  1503. Lipschitz Bandits in Optimal Space

  1504. Bootstrapped Model Predictive Control

  1505. Disentangling 3D Animal Pose Dynamics with Scrubbed Conditional Latent Variables

  1506. Simple Guidance Mechanisms for Discrete Diffusion Models

  1507. MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs

  1508. Language Guided Skill Discovery

  1509. Valid Conformal Prediction for Dynamic GNNs

  1510. ECHOPulse: ECG Controlled Echocardio-gram Video Generation

  1511. SymDiff: Equivariant Diffusion via Stochastic Symmetrisation

  1512. Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization

  1513. Deep Distributed Optimization for Large-Scale Quadratic Programming

  1514. Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification

  1515. Neural Stochastic Differential Equations for Uncertainty-Aware Offline RL

  1516. DPaI: Differentiable Pruning at Initialization with Node-Path Balance Principle

  1517. Improving Pretraining Data Using Perplexity Correlations

  1518. Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond

  1519. GSBA$^K$: $top$-$K$ Geometric Score-based Black-box Attack

  1520. Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning

  1521. Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs

  1522. TDDBench: A Benchmark for Training data detection

  1523. AutoG: Towards automatic graph construction from tabular data

  1524. DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory

  1525. Mind Control through Causal Inference: Predicting Clean Images from Poisoned Data

  1526. Can Knowledge Editing Really Correct Hallucinations?

  1527. OvercookedV2: Rethinking Overcooked for Zero-Shot Coordination

  1528. Adversarial Search Engine Optimization for Large Language Models

  1529. Causal Representation Learning from Multimodal Biomedical Observations

  1530. Towards Robust Multimodal Open-set Test-time Adaptation via Adaptive Entropy-aware Optimization

  1531. Mixture Compressor for Mixture-of-Experts LLMs Gains More

  1532. Efficient Exploration and Discriminative World Model Learning with an Object-Centric Abstraction

  1533. Watch Less, Do More: Implicit Skill Discovery for Video-Conditioned Policy

  1534. SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation

  1535. Causal Graphical Models for Vision-Language Compositional Understanding

  1536. Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks

  1537. Immunogenicity Prediction with Dual Attention Enables Vaccine Target Selection

  1538. Privately Counting Partially Ordered Data

  1539. Mining your own secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models

  1540. Large Language Models are Interpretable Learners

  1541. No Location Left Behind: Measuring and Improving the Fairness of Implicit Representations for Earth Data

  1542. Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization

  1543. DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors

  1544. Uncertainty-Aware Decoding with Minimum Bayes Risk

  1545. GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation

  1546. Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration

  1547. PWM: Policy Learning with Multi-Task World Models

  1548. SAGEPhos: Sage Bio-Coupled and Augmented Fusion for Phosphorylation Site Detection

  1549. OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?

  1550. NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics

  1551. GI-GS: Global Illumination Decomposition on Gaussian Splatting for Inverse Rendering

  1552. Real-Time Video Generation with Pyramid Attention Broadcast

  1553. HAMSTER: Hierarchical Action Models for Open-World Robot Manipulation

  1554. Improving Complex Reasoning with Dynamic Prompt Corruption: A Soft Prompt Optimization Approach

  1555. Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods

  1556. Sharpness-Aware Black-Box Optimization

  1557. Model Risk-sensitive Offline Reinforcement Learning

  1558. BANGS: Game-theoretic Node Selection for Graph Self-Training

  1559. RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval

  1560. Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment

  1561. Sensitivity Verification for Additive Decision Tree Ensembles

  1562. An Auditing Test to Detect Behavioral Shift in Language Models

  1563. Rethinking the role of frames for SE(3)-invariant crystal structure modeling

  1564. Learning to Select Nodes in Branch and Bound with Sufficient Tree Representation

  1565. PortLLM: Personalizing Evolving Large Language Models with Training-Free and Portable Model Patches

  1566. T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data

  1567. Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization

  1568. Beyond Surface Structure: A Causal Assessment of LLMs' Comprehension ability

  1569. Unify ML4TSP: Drawing Methodological Principles for TSP and Beyond from Streamlined Design Space of Learning and Search

  1570. Minimal Variance Model Aggregation: A principled, non-intrusive, and versatile integration of black box models

  1571. Rethinking Visual Counterfactual Explanations Through Region Constraint

  1572. A Conditional Independence Test in the Presence of Discretization

  1573. Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression

  1574. ClimaQA: An Automated Evaluation Framework for Climate Question Answering Models

  1575. Directional Gradient Projection for Robust Fine-Tuning of Foundation Models

  1576. SCBench: A KV Cache-Centric Analysis of Long-Context Methods

  1577. Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models

  1578. GraphBridge: Towards Arbitrary Transfer Learning in GNNs

  1579. Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation

  1580. Strategist: Self-improvement of LLM Decision Making via Bi-Level Tree Search

  1581. Precise Parameter Localization for Textual Generation in Diffusion Models

  1582. LIFe-GoM: Generalizable Human Rendering with Learned Iterative Feedback Over Multi-Resolution Gaussians-on-Mesh

  1583. MAI: A Multi-turn Aggregation-Iteration Model for Composed Image Retrieval

  1584. A Truncated Newton Method for Optimal Transport

  1585. Depth Any Video with Scalable Synthetic Data

  1586. Residual-MPPI: Online Policy Customization for Continuous Control

  1587. Efficient Biological Data Acquisition through Inference Set Design

  1588. Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining

  1589. RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction

  1590. Rare event modeling with self-regularized normalizing flows: what can we learn from a single failure?

  1591. DocMIA: Document-Level Membership Inference Attacks against DocVQA Models

  1592. Scalable Influence and Fact Tracing for Large Language Model Pretraining

  1593. Towards Auto-Regressive Next-Token Prediction: In-context Learning Emerges from Generalization

  1594. Safety Representations for Safer Policy Learning

  1595. Integrative Decoding: Improving Factuality via Implicit Self-consistency

  1596. CrossMPT: Cross-attention Message-passing Transformer for Error Correcting Codes

  1597. Exploring channel distinguishability in local neighborhoods of the model space in quantum neural networks

  1598. Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark

  1599. GNNs Getting ComFy: Community and Feature Similarity Guided Rewiring

  1600. Local Loss Optimization in the Infinite Width: Stable Parameterization of Predictive Coding Networks and Target Propagation

  1601. Beyond Interpretability: The Gains of Feature Monosemanticity on Model Robustness

  1602. Multi-domain Distribution Learning for De Novo Drug Design

  1603. Black Sheep in the Herd: Playing with Spuriously Correlated Attributes for Vision-Language Recognition

  1604. ADBM: Adversarial Diffusion Bridge Model for Reliable Adversarial Purification

  1605. Eliminating Position Bias of Language Models: A Mechanistic Approach

  1606. CL-MFAP: A Contrastive Learning-Based Multimodal Foundation Model for Molecular Property Prediction and Antibiotic Screening

  1607. A Unified Theory of Quantum Neural Network Loss Landscapes

  1608. Transformer Learns Optimal Variable Selection in Group-Sparse Classification

  1609. Robust-PIFu: Robust Pixel-aligned Implicit Function for 3D Human Digitalization from a Single Image

  1610. Fantastic Copyrighted Beasts and How (Not) to Generate Them

  1611. Gramian Multimodal Representation Learning and Alignment

  1612. ADePT: Adaptive Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning

  1613. As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss

  1614. OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling

  1615. FIG: Flow with Interpolant Guidance for Linear Inverse Problems

  1616. Block Verification Accelerates Speculative Decoding

  1617. Weighted-Reward Preference Optimization for Implicit Model Fusion

  1618. Simple ReFlow: Improved Techniques for Fast Flow Models

  1619. BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games

  1620. UGMathBench: A Diverse and Dynamic Benchmark for Undergraduate-Level Mathematical Reasoning with Large Language Models

  1621. Causal Graph Transformer for Treatment Effect Estimation Under Unknown Interference

  1622. Efficient Imitation under Misspecification

  1623. Learned Reference-based Diffusion Sampler for multi-modal distributions

  1624. Exploring The Forgetting in Adversarial Training: A Novel Method for Enhancing Robustness

  1625. CS-Bench: A Comprehensive Benchmark for Large Language Models towards Computer Science Mastery

  1626. Scalable Bayesian Learning with posteriors

  1627. $\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee

  1628. Teaching Human Behavior Improves Content Understanding Abilities Of VLMs

  1629. Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Video and Multi-view Diffusion Models

  1630. Enhancing End-to-End Autonomous Driving with Latent World Model

  1631. Optimality of Matrix Mechanism on $\ell_p^p$-metric

  1632. Can We Ignore Labels in Out of Distribution Detection?

  1633. Statistical Advantages of Perturbing Cosine Router in Mixture of Experts

  1634. Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization

  1635. Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference

  1636. On Evaluating the Durability of Safeguards for Open-Weight LLMs

  1637. GReaTer: Gradients Over Reasoning Makes Smaller Language Models Strong Prompt Optimizers

  1638. Looking Backward: Retrospective Backward Synthesis for Goal-Conditioned GFlowNets

  1639. When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings

  1640. LongMamba: Enhancing Mamba's Long-Context Capabilities via Training-Free Receptive Field Enlargement

  1641. Unlearning or Obfuscating? Jogging the Memory of Unlearned LLMs via Benign Relearning

  1642. What is Wrong with Perplexity for Long-context Language Modeling?

  1643. Recovery of Causal Graph Involving Latent Variables via Homologous Surrogates

  1644. OpenPRM: Building Open-domain Process-based Reward Models with Preference Trees

  1645. Efficient Discovery of Pareto Front for Multi-Objective Reinforcement Learning

  1646. TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models

  1647. Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs

  1648. Controllable Satellite-to-Street-View Synthesis with Precise Pose Alignment and Zero-Shot Environmental Control

  1649. MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

  1650. API Pack: A Massive Multi-Programming Language Dataset for API Call Generation

  1651. Capability Localization: Capabilities Can be Localized rather than Individual Knowledge

  1652. Federated Continual Learning Goes Online: Uncertainty-Aware Memory Management for Vision Tasks and Beyond

  1653. Lasso Bandit with Compatibility Condition on Optimal Arm

  1654. Safety-Prioritizing Curricula for Constrained Reinforcement Learning

  1655. Action Sequence Augmentation for Action Anticipation

  1656. The Rise and Down of Babel Tower: Investigating the Evolution Process of Multilingual Code Large Language Model

  1657. Parameter Expanded Stochastic Gradient Markov Chain Monte Carlo

  1658. What Secrets Do Your Manifolds Hold? Understanding the Local Geometry of Generative Models

  1659. TD-Paint: Faster Diffusion Inpainting Through Time-Aware Pixel Conditioning

  1660. Why Does the Effective Context Length of LLMs Fall Short?

  1661. Vision CNNs trained to estimate spatial latents learned similar ventral-stream-aligned representations

  1662. Value-aligned Behavior Cloning for Offline Reinforcement Learning via Bi-level Optimization

  1663. Persistent Pre-training Poisoning of LLMs

  1664. BadRobot: Jailbreaking Embodied LLM Agents in the Physical World

  1665. Disentangled Representation Learning with the Gromov-Monge Gap

  1666. Diffusion Bridge Implicit Models

  1667. Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning

  1668. A Meta-Learning Approach to Bayesian Causal Discovery

  1669. TweedieMix: Improving Multi-Concept Fusion for Diffusion-based Image/Video Generation

  1670. Learning Neural Networks with Distribution Shift: Efficiently Certifiable Guarantees

  1671. Filtered not Mixed: Filtering-Based Online Gating for Mixture of Large Language Models

  1672. Looking Inward: Language Models Can Learn About Themselves by Introspection

  1673. DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation

  1674. Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning

  1675. Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood

  1676. How Does Vision-Language Adaptation Impact the Safety of Vision Language Models?

  1677. Multi-Resolution Decomposable Diffusion Model for Non-Stationary Time Series Anomaly Detection

  1678. Layerwise Recurrent Router for Mixture-of-Experts

  1679. On Minimizing Adversarial Counterfactual Error in Adversarial Reinforcement Learning

  1680. GraphRouter: A Graph-based Router for LLM Selections

  1681. Rethinking Audio-Visual Adversarial Vulnerability from Temporal and Modality Perspectives

  1682. Continuous Ensemble Weather Forecasting with Diffusion models

  1683. Generating Physical Dynamics under Priors

  1684. DataMan: Data Manager for Pre-training Large Language Models

  1685. Preserving Deep Representations in One-Shot Pruning: A Hessian-Free Second-Order Optimization Framework

  1686. UniDetox: Universal Detoxification of Large Language Models via Dataset Distillation

  1687. Arithmetic Transformers Can Length-Generalize in Both Operand Length and Count

  1688. Look Before You Leap: Universal Emergent Mechanism for Retrieval in Language Models

  1689. Matrix Product Sketching via Coordinated Sampling

  1690. 3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling

  1691. Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs

  1692. Learning Clustering-based Prototypes for Compositional Zero-Shot Learning

  1693. BadJudge: Backdoor Vulnerabilities of LLM-As-A-Judge

  1694. Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling

  1695. Pairwise Elimination with Instance-Dependent Guarantees for Bandits with Cost Subsidy

  1696. Improved Techniques for Optimization-Based Jailbreaking on Large Language Models

  1697. Beyond Worst-Case Dimensionality Reduction for Sparse Vectors

  1698. PAD: Personalized Alignment of LLMs at Decoding-time

  1699. Dreamweaver: Learning Compositional World Models from Pixels

  1700. Policy Decorator: Model-Agnostic Online Refinement for Large Policy Model

  1701. Ensembling Diffusion Models via Adaptive Feature Aggregation

  1702. Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models

  1703. Efficient Reinforcement Learning with Large Language Model Priors

  1704. Federated Residual Low-Rank Adaptation of Large Language Models

  1705. Convex Formulations for Training Two-Layer ReLU Neural Networks

  1706. RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph

  1707. When Graph Neural Networks Meet Dynamic Mode Decomposition

  1708. TC-MoE: Augmenting Mixture of Experts with Ternary Expert Choice

  1709. DeepTAGE: Deep Temporal-Aligned Gradient Enhancement for Optimizing Spiking Neural Networks

  1710. Learning Diagrams: A Graphical Language for Compositional Training Regimes

  1711. Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace

  1712. Aligned Datasets Improve Detection of Latent Diffusion-Generated Images

  1713. Progressive Token Length Scaling in Transformer Encoders for Efficient Universal Segmentation

  1714. Learning Structured Universe Graph with Outlier OOD Detection for Partial Matching

  1715. FACTS: A Factored State-Space Framework for World Modelling

  1716. Bootstrapping Language Models with DPO Implicit Rewards

  1717. Gaussian Splatting Lucas-Kanade

  1718. Improving Deep Regression with Tightness

  1719. Fine-Tuning Attention Modules Only: Enhancing Weight Disentanglement in Task Arithmetic

  1720. Reasoning with Latent Thoughts: On the Power of Looped Transformers

  1721. Provable unlearning in topic modeling and downstream tasks

  1722. Transformer-Squared: Self-adaptive LLMs

  1723. Quantum-PEFT: Ultra parameter-efficient fine-tuning

  1724. FOSP: Fine-tuning Offline Safe Policy through World Models

  1725. Dynamic Sparse Training versus Dense Training: The Unexpected Winner in Image Corruption Robustness

  1726. Gaussian-Based Instance-Adaptive Intensity Modeling for Point-Supervised Facial Expression Spotting

  1727. Large Scale Knowledge Washing

  1728. ImProver: Agent-Based Automated Proof Optimization

  1729. SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation

  1730. Offline Hierarchical Reinforcement Learning via Inverse Optimization

  1731. FreeVS: Generative View Synthesis on Free Driving Trajectory

  1732. Training Free Exponential Context Extension via Cascading KV Cache

  1733. Tool-Planner: Task Planning with Clusters across Multiple Tools

  1734. Generalizable Human Gaussians from Single-View Image

  1735. Calibrating Expressions of Certainty

  1736. MOFFlow: Flow Matching for Structure Prediction of Metal-Organic Frameworks

  1737. Sequential Controlled Langevin Diffusions

  1738. MixMax: Distributional Robustness in Function Space via Optimal Data Mixtures

  1739. Locality-aware Gaussian Compression for Fast and High-quality Rendering

  1740. SyllableLM: Learning Coarse Semantic Units for Speech Language Models

  1741. PooDLe🐩: Pooled and dense self-supervised learning from naturalistic videos

  1742. In-context Time Series Predictor

  1743. CodePlan: Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning

  1744. Large Language Models Assume People are More Rational than We Really are

  1745. DELTA: Dense Efficient Long-Range 3D Tracking for Any Video

  1746. Policy Design in Long-run Welfare Dynamics

  1747. MallowsPO: Fine-Tune Your LLM with Preference Dispersions

  1748. Morphing Tokens Draw Strong Masked Image Models

  1749. Hybrid Regularization Improves Diffusion-based Inverse Problem Solving

  1750. Not All Language Model Features Are One-Dimensionally Linear

  1751. CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale

  1752. GSE: Group-wise Sparse and Explainable Adversarial Attacks

  1753. Efficient Training of Neural Stochastic Differential Equations by Matching Finite Dimensional Distributions

  1754. Denoising with a Joint-Embedding Predictive Architecture

  1755. Controlling Space and Time with Diffusion Models

  1756. Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs

  1757. MotionDreamer: One-to-Many Motion Synthesis with Localized Generative Masked Transformer

  1758. Gyrogroup Batch Normalization

  1759. Identifying latent state transitions in non-linear dynamical systems

  1760. Exploiting Distribution Constraints for Scalable and Efficient Image Retrieval

  1761. Overcoming Lower-Level Constraints in Bilevel Optimization: A Novel Approach with Regularized Gap Functions

  1762. Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions

  1763. Continuous Autoregressive Modeling with Stochastic Monotonic Alignment for Speech Synthesis

  1764. VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning

  1765. Towards Generalization Bounds of GCNs for Adversarially Robust Node Classification

  1766. Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference

  1767. Occlusion-aware Non-Rigid Point Cloud Registration via Unsupervised Neural Deformation Correntropy

  1768. LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

  1769. Towards Interpreting Visual Information Processing in Vision-Language Models

  1770. On-the-fly Preference Alignment via Principle-Guided Decoding

  1771. GeoILP: A Synthetic Dataset to Guide Large-Scale Rule Induction

  1772. Certified Robustness Under Bounded Levenshtein Distance

  1773. How to Evaluate Reward Models for RLHF

  1774. Flash Inference: Near Linear Time Inference for Long Convolution Sequence Models and Beyond

  1775. From Risk to Uncertainty: Generating Predictive Uncertainty Measures via Bayesian Estimation

  1776. Efficient Learning with Sine-Activated Low-Rank Matrices

  1777. Towards Homogeneous Lexical Tone Decoding from Heterogeneous Intracranial Recordings

  1778. Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

  1779. Tight Time Complexities in Parallel Stochastic Optimization with Arbitrary Computation Dynamics

  1780. Accelerating Training with Neuron Interaction and Nowcasting Networks

  1781. CViT: Continuous Vision Transformer for Operator Learning

  1782. eQMARL: Entangled Quantum Multi-Agent Reinforcement Learning for Distributed Cooperation over Quantum Channels

  1783. CertainlyUncertain: A Benchmark and Metric for Multimodal Epistemic and Aleatoric Awareness

  1784. Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding

  1785. ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities

  1786. Adversarial Latent Feature Augmentation for Fairness

  1787. Find A Winning Sign: Sign Is All We Need to Win the Lottery

  1788. Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents

  1789. Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference

  1790. Fine-Grained Verifiers: Preference Modeling as Next-token Prediction in Vision-Language Alignment

  1791. REvolve: Reward Evolution with Large Language Models using Human Feedback

  1792. DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

  1793. Interaction Asymmetry: A General Principle for Learning Composable Abstractions

  1794. CoInD: Enabling Logical Compositions in Diffusion Models

  1795. Beyond FVD: An Enhanced Evaluation Metrics for Video Generation Distribution Quality

  1796. Foundation Models Secretly Understand Neural Network Weights: Enhancing Hypernetwork Architectures with Foundation Models

  1797. Diff-PIC: Revolutionizing Particle-In-Cell Nuclear Fusion Simulation with Diffusion Models

  1798. A New Perspective on Shampoo's Preconditioner

  1799. Dataset Distillation via Knowledge Distillation: Towards Efficient Self-Supervised Pre-training of Deep Networks

  1800. Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models

  1801. On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality

  1802. Out-of-distribution Generalization for Total Variation based Invariant Risk Minimization

  1803. Are Large Vision Language Models Good Game Players?

  1804. Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene

  1805. $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs

  1806. A Large-scale Training Paradigm for Graph Generative Models

  1807. ReMatching Dynamic Reconstruction Flow

  1808. Neural networks on Symmetric Spaces of Noncompact Type

  1809. Repulsive Latent Score Distillation for Solving Inverse Problems

  1810. Breaking the $\log(1/\Delta_2)$ Barrier: Better Batched Best Arm Identification with Adaptive Grids

  1811. Test-time Adaptation for Image Compression with Distribution Regularization

  1812. GeoLoRA: Geometric integration for parameter efficient fine-tuning

  1813. ImDy: Human Inverse Dynamics from Imitated Observations

  1814. MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation

  1815. LoRanPAC: Low-rank Random Features and Pre-trained Models for Bridging Theory and Practice in Continual Learning

  1816. Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap

  1817. SOO-Bench: Benchmarks for Evaluating the Stability of Offline Black-Box Optimization

  1818. Selective Induction Heads: How Transformers Select Causal Structures in Context

  1819. PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs

  1820. Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces

  1821. CycleResearcher: Improving Automated Research via Automated Review

  1822. Diffusion Transformers for Tabular Data Time Series Generation

  1823. No Preference Left Behind: Group Distributional Preference Optimization

  1824. Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want

  1825. Perturbation-Restrained Sequential Model Editing

  1826. Making Transformer Decoders Better Differentiable Indexers

  1827. Can Generative AI Solve Your In-Context Learning Problem? A Martingale Perspective

  1828. Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass

  1829. Human-Aligned Chess With a Bit of Search

  1830. Dysca: A Dynamic and Scalable Benchmark for Evaluating Perception Ability of LVLMs

  1831. On the Transfer of Object-Centric Representation Learning

  1832. Prototype antithesis for biological few-shot class-incremental learning

  1833. CoMRes: Semi-Supervised Time Series Forecasting Utilizing Consensus Promotion of Multi-Resolution

  1834. Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data

  1835. Bridging the Gap between Variational Inference and Stochastic Gradient MCMC in Function Space

  1836. Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations

  1837. Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling

  1838. Gradient descent with generalized Newton’s method

  1839. MeshMask: Physics-Based Simulations with Masked Graph Neural Networks

  1840. Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer

  1841. Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning

  1842. NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative

  1843. $\text{I}^2\text{AM}$: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps

  1844. A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning

  1845. Mini-batch Coresets for Memory-efficient Language Model Training on Data Mixtures

  1846. The Case for Cleaner Biosignals: High-fidelity Neural Compressor Enables Transfer from Cleaner iEEG to Noisier EEG

  1847. X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing

  1848. No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models

  1849. Rethinking Shapley Value for Negative Interactions in Non-convex Games

  1850. Adapting Multi-modal Large Language Model to Concept Drift From Pre-training Onwards

  1851. BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks

  1852. TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies

  1853. DeepGate4: Efficient and Effective Representation Learning for Circuit Design at Scale

  1854. Risk-Sensitive Diffusion: Robustly Optimizing Diffusion Models with Noisy Samples

  1855. AgentStudio: A Toolkit for Building General Virtual Agents

  1856. Inner Information Analysis Algorithm for Deep Neural Network based on Community

  1857. Efficient Evolutionary Search Over Chemical Space with Large Language Models

  1858. Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

  1859. RecDreamer: Consistent Text-to-3D Generation via Uniform Score Distillation

  1860. KAA: Kolmogorov-Arnold Attention for Enhancing Attentive Graph Neural Networks

  1861. Understanding and Enhancing the Transferability of Jailbreaking Attacks

  1862. Is Factuality Enhancement a Free Lunch For LLMs? Better Factuality Can Lead to Worse Context-Faithfulness

  1863. Robust Representation Consistency Model via Contrastive Denoising

  1864. Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws

  1865. Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling

  1866. Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models

  1867. Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement

  1868. Glimpse: Enabling White-Box Methods to Use Proprietary Models for Zero-Shot LLM-Generated Text Detection

  1869. HadamRNN: Binary and Sparse Ternary Orthogonal RNNs

  1870. Denoising Autoregressive Transformers for Scalable Text-to-Image Generation

  1871. Learning Geometric Reasoning Networks For Robot Task And Motion Planning

  1872. DexTrack: Towards Generalizable Neural Tracking Control for Dexterous Manipulation from Human References

  1873. MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation

  1874. VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models

  1875. IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking

  1876. MotionClone: Training-Free Motion Cloning for Controllable Video Generation

  1877. InCoDe: Interpretable Compressed Descriptions For Image Generation

  1878. Standardizing Structural Causal Models

  1879. Training Neural Networks as Recognizers of Formal Languages

  1880. Searching for Optimal Solutions with LLMs via Bayesian Optimization

  1881. QMP: Q-switch Mixture of Policies for Multi-Task Behavior Sharing

  1882. Confidence Elicitation: A New Attack Vector for Large Language Models

  1883. Injecting Universal Jailbreak Backdoors into LLMs in Minutes

  1884. High-Quality Joint Image and Video Tokenization with Causal VAE

  1885. Group-robust Sample Reweighting for Subpopulation Shifts via Influence Functions

  1886. Mitigate the Gap: Improving Cross-Modal Alignment in CLIP

  1887. FairDen: Fair Density-Based Clustering

  1888. Stabilized Neural Prediction of Potential Outcomes in Continuous Time

  1889. KBLaM: Knowledge Base augmented Language Model

  1890. Robust Feature Learning for Multi-Index Models in High Dimensions

  1891. Towards a General Time Series Anomaly Detector with Adaptive Bottlenecks and Dual Adversarial Decoders

  1892. ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning

  1893. Toward Understanding In-context vs. In-weight Learning

  1894. Does Refusal Training in LLMs Generalize to the Past Tense?

  1895. Wasserstein-Regularized Conformal Prediction under General Distribution Shift

  1896. Diff3DS: Generating View-Consistent 3D Sketch via Differentiable Curve Rendering

  1897. Self-Updatable Large Language Models by Integrating Context into Model Parameters

  1898. SaMer: A Scenario-aware Multi-dimensional Evaluator for Large Language Models

  1899. Investigating Pattern Neurons in Urban Time Series Forecasting

  1900. Retrieval Augmented Diffusion Model for Structure-informed Antibody Design and Optimization

  1901. CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes

  1902. Pangea: A Fully Open Multilingual Multimodal LLM for 39 Languages

  1903. Self-Play Preference Optimization for Language Model Alignment

  1904. One-for-All Few-Shot Anomaly Detection via Instance-Induced Prompt Learning

  1905. Rethinking Spiking Neural Networks from an Ensemble Learning Perspective

  1906. DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models

  1907. TLDR: Token-Level Detective Reward Model for Large Vision Language Models

  1908. PvNeXt: Rethinking Network Design and Temporal Motion for Point Cloud Video Recognition

  1909. Enhancing Language Model Agents using Diversity of Thoughts

  1910. Self-Evolved Reward Learning for LLMs

  1911. Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling

  1912. LDAdam: Adaptive Optimization from Low-Dimensional Gradient Statistics

  1913. InstantPortrait: One-Step Portrait Editing via Diffusion Multi-Objective Distillation

  1914. What Makes Large Language Models Reason in (Multi-Turn) Code Generation?

  1915. Generalized Consistency Trajectory Models for Image Manipulation

  1916. Complexity Lower Bounds of Adaptive Gradient Algorithms for Non-convex Stochastic Optimization under Relaxed Smoothness

  1917. Federated Few-Shot Class-Incremental Learning

  1918. Theory, Analysis, and Best Practices for Sigmoid Self-Attention

  1919. KLay: Accelerating Arithmetic Circuits for Neurosymbolic AI

  1920. Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models

  1921. Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo

  1922. ASTrA: Adversarial Self-supervised Training with Adaptive-Attacks

  1923. SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking

  1924. Actions Speak Louder Than Words: Rate-Reward Trade-off in Markov Decision Processes

  1925. Diff-2-in-1: Bridging Generation and Dense Perception with Diffusion Models

  1926. Truncated Consistency Models

  1927. From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks

  1928. A Policy-Gradient Approach to Solving Imperfect-Information Games with Best-Iterate Convergence

  1929. ARB-LLM: Alternating Refined Binarizations for Large Language Models

  1930. Radar: Fast Long-Context Decoding for Any Transformer

  1931. Self-Improving Robust Preference Optimization

  1932. Oracle efficient truncated statistics

  1933. Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Gate

  1934. Robust System Identification: Finite-sample Guarantees and Connection to Regularization

  1935. TopoDiffusionNet: A Topology-aware Diffusion Model

  1936. Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs

  1937. Simulating Training Dynamics to Reconstruct Training Data from Deep Neural Networks

  1938. Optimized Multi-Token Joint Decoding With Auxiliary Model for LLM Inference

  1939. InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation

  1940. Glad: A Streaming Scene Generator for Autonomous Driving

  1941. Improving Equivariant Networks with Probabilistic Symmetry Breaking

  1942. InterMask: 3D Human Interaction Generation via Collaborative Masked Modeling

  1943. Cross-Domain Off-Policy Evaluation and Learning for Contextual Bandits

  1944. PhyloVAE: Unsupervised Learning of Phylogenetic Trees via Variational Autoencoders

  1945. NEAR: A Training-Free Pre-Estimator of Machine Learning Model Performance

  1946. CameraCtrl: Enabling Camera Control for Video Diffusion Models

  1947. FlexCAD: Unified and Versatile Controllable CAD Generation with Fine-tuned Large Language Models

  1948. Scaling Laws for Adversarial Attacks on Language Model Activations and Tokens

  1949. Permute-and-Flip: An optimally stable and watermarkable decoder for LLMs

  1950. Elliptic Loss Regularization

  1951. Toward Efficient Multi-Agent Exploration With Trajectory Entropy Maximization

  1952. Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks

  1953. Concept Bottleneck Language Models For Protein Design

  1954. CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer

  1955. MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model

  1956. Conformalized Interactive Imitation Learning: Handling Expert Shift and Intermittent Feedback

  1957. Learning 3D Perception from Others' Predictions

  1958. W-PCA Based Gradient-Free Proxy for Efficient Search of Lightweight Language Models

  1959. SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal

  1960. Dist Loss: Enhancing Regression in Few-Shot Region through Distribution Distance Constraint

  1961. Multilevel Generative Samplers for Investigating Critical Phenomena

  1962. Calibrating LLMs with Information-Theoretic Evidential Deep Learning

  1963. GOttack: Universal Adversarial Attacks on Graph Neural Networks via Graph Orbits Learning

  1964. Dissecting Adversarial Robustness of Multimodal LM Agents

  1965. BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models

  1966. A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

  1967. COMBO: Compositional World Models for Embodied Multi-Agent Cooperation

  1968. On the Crucial Role of Initialization for Matrix Factorization

  1969. Pedestrian Motion Reconstruction: A Large-scale Benchmark via Mixed Reality Rendering with Multiple Perspectives and Modalities

  1970. Handling Delay in Real-Time Reinforcement Learning

  1971. Progressive Parameter Efficient Transfer Learning for Semantic Segmentation

  1972. BoneMet: An Open Large-Scale Multi-Modal Murine Dataset for Breast Cancer Bone Metastasis Diagnosis and Prognosis

  1973. PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer

  1974. Reconstruction-Guided Policy: Enhancing Decision-Making through Agent-Wise State Consistency

  1975. SigDiffusions: Score-Based Diffusion Models for Time Series via Log-Signature Embeddings

  1976. Efficient Interpolation between Extragradient and Proximal Methods for Weak MVIs

  1977. Infinite-Resolution Integral Noise Warping for Diffusion Models

  1978. Efficient stagewise pretraining via progressive subnetworks

  1979. Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems

  1980. EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing

  1981. GraphArena: Evaluating and Exploring Large Language Models on Graph Computation

  1982. OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code

  1983. Efficient Neuron Segmentation in Electron Microscopy by Affinity-Guided Queries

  1984. Improved Sampling Algorithms for Lévy-Itô Diffusion Models

  1985. Rapidly Adapting Policies to the Real-World via Simulation-Guided Fine-Tuning

  1986. MP-Mat: A 3D-and-Instance-Aware Human Matting and Editing Framework with Multiplane Representation

  1987. New Algorithms for the Learning-Augmented k-means Problem

  1988. Geometry of Lightning Self-Attention: Identifiability and Dimension

  1989. Unlocking Guidance for Discrete State-Space Diffusion and Flow Models

  1990. Attributing Culture-Conditioned Generations to Pretraining Corpora

  1991. UNIP: Rethinking Pre-trained Attention Patterns for Infrared Semantic Segmentation

  1992. IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning

  1993. Deep Linear Probe Generators for Weight Space Learning

  1994. Beyond Circuit Connections: A Non-Message Passing Graph Transformer Approach for Quantum Error Mitigation

  1995. Neural ODE Transformers: Analyzing Internal Dynamics and Adaptive Fine-tuning

  1996. TSC-Net: Prediction of Pedestrian Trajectories by Trajectory-Scene-Cell Classification

  1997. Optimizing Backward Policies in GFlowNets via Trajectory Likelihood Maximization

  1998. COAT: Compressing Optimizer states and Activations for Memory-Efficient FP8 Training

  1999. Self-supervised Monocular Depth Estimation Robust to Reflective Surface Leveraged by Triplet Mining

  2000. See It from My Perspective: How Language Affects Cultural Bias in Image Understanding

  2001. 3D-Spatial MultiModal Memory

  2002. Agree to Disagree: Demystifying Homogeneous Deep Ensembles through Distributional Equivalence

  2003. Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On

  2004. Graph Neural Networks for Edge Signals: Orientation Equivariance and Invariance

  2005. AnyTouch: Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors

  2006. Contrastive Learning from Synthetic Audio Doppelgängers

  2007. Going Beyond Static: Understanding Shifts with Time-Series Attribution

  2008. StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces

  2009. Modeling Unseen Environments with Language-guided Composable Causal Components in Reinforcement Learning

  2010. ContraDiff: Planning Towards High Return States via Contrastive Learning

  2011. ToolGen: Unified Tool Retrieval and Calling via Generation

  2012. Query-based Knowledge Transfer for Heterogeneous Learning Environments

  2013. ProtoSnap: Prototype Alignment For Cuneiform Signs

  2014. DOCS: Quantifying Weight Similarity for Deeper Insights into Large Language Models

  2015. SimulPL: Aligning Human Preferences in Simultaneous Machine Translation

  2016. Residual Stream Analysis with Multi-Layer SAEs

  2017. A Statistical Approach for Controlled Training Data Detection

  2018. MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses

  2019. Fine-tuning can Help Detect Pretraining Data from Large Language Models

  2020. Centrality-guided Pre-training for Graph

  2021. Long-time asymptotics of noisy SVGD outside the population limit

  2022. MANTRA: The Manifold Triangulations Assemblage

  2023. Zero-Shot Natural Language Explanations

  2024. Everything is Editable: Extend Knowledge Editing to Unstructured Data in Large Language Models

  2025. Aria-MIDI: A Dataset of Piano MIDI Files for Symbolic Music Modeling

  2026. Multi-level Certified Defense Against Poisoning Attacks in Offline Reinforcement Learning

  2027. Decoupling Angles and Strength in Low-rank Adaptation

  2028. F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI

  2029. Reassessing How to Compare and Improve the Calibration of Machine Learning Models

  2030. Breaking Class Barriers: Efficient Dataset Distillation via Inter-Class Feature Compensator

  2031. Semantic Aware Representation Learning for Lifelong Learning

  2032. Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA

  2033. GANDALF: Generative AttentioN based Data Augmentation and predictive modeLing Framework for personalized cancer treatment

  2034. Controllable Blur Data Augmentation Using 3D-Aware Motion Estimation

  2035. Optimal Non-Asymptotic Rates of Value Iteration for Average-Reward Markov Decision Processes

  2036. A Theoretical Perspective: How to Prevent Model Collapse in Self-consuming Training Loops

  2037. MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models

  2038. Contextual Document Embeddings

  2039. Anyprefer: An Agentic Framework for Preference Data Synthesis

  2040. Small Models are LLM Knowledge Triggers for Medical Tabular Prediction

  2041. GenDataAgent: On-the-fly Dataset Augmentation with Synthetic Data

  2042. GMValuator: Similarity-based Data Valuation for Generative Models

  2043. Optimal Learning of Kernel Logistic Regression for Complex Classification Scenarios

  2044. Building Math Agents with Multi-Turn Iterative Preference Learning

  2045. DyCAST: Learning Dynamic Causal Structure from Time Series

  2046. Three Mechanisms of Feature Learning in a Linear Network

  2047. Graph Neural Networks Gone Hogwild

  2048. Decoding Game: On Minimax Optimality of Heuristic Text Generation Strategies

  2049. I-Con: A Unifying Framework for Representation Learning

  2050. On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback

  2051. A Large-scale Dataset and Benchmark for Commuting Origin-Destination Flow Generation

  2052. Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning

  2053. Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization

  2054. Scaling Optimal LR Across Token Horizons

  2055. MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions

  2056. Towards Synergistic Path-based Explanations for Knowledge Graph Completion: Exploration and Evaluation

  2057. Transition Path Sampling with Improved Off-Policy Training of Diffusion Path Samplers

  2058. Discovering Influential Neuron Path in Vision Transformers

  2059. Re-Evaluating the Impact of Unseen-Class Unlabeled Data on Semi-Supervised Learning Model

  2060. Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing

  2061. Scaling up Masked Diffusion Models on Text

  2062. REBIND: Enhancing Ground-state Molecular Conformation Prediction via Force-Based Graph Rewiring

  2063. Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation

  2064. Training Robust Ensembles Requires Rethinking Lipschitz Continuity

  2065. Does Spatial Cognition Emerge in Frontier Models?

  2066. Learning Multi-Index Models with Neural Networks via Mean-Field Langevin Dynamics

  2067. Rotated Runtime Smooth: Training-Free Activation Smoother for accurate INT4 inference

  2068. Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation

  2069. Distributional Associations vs In-Context Reasoning: A Study of Feed-forward and Attention Layers

  2070. Salvage: Shapley-distribution Approximation Learning Via Attribution Guided Exploration for Explainable Image Classification

  2071. PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance

  2072. Adaptive Pruning of Pretrained Transformer via Differential Inclusions

  2073. Variational Best-of-N Alignment

  2074. Data Center Cooling System Optimization Using Offline Reinforcement Learning

  2075. Adaptive Transformer Programs: Bridging the Gap Between Performance and Interpretability in Transformers

  2076. Don't Take Things Out of Context: Attention Intervention for Enhancing Chain-of-Thought Reasoning in Large Language Models

  2077. FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality

  2078. Decision Information Meets Large Language Models: The Future of Explainable Operations Research

  2079. Forget the Data and Fine-Tuning! Just Fold the Network to Compress

  2080. Differentially Private Federated Learning with Time-Adaptive Privacy Spending

  2081. TimeInf: Time Series Data Contribution via Influence Functions

  2082. Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model

  2083. HaDeMiF: Hallucination Detection and Mitigation in Large Language Models

  2084. Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent

  2085. Navigation-Guided Sparse Scene Representation for End-to-End Autonomous Driving

  2086. OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning

  2087. GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models

  2088. Fengbo: a Clifford Neural Operator pipeline for 3D PDEs in Computational Fluid Dynamics

  2089. Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data

  2090. NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks In Open Domains

  2091. An Effective Theory of Bias Amplification

  2092. K-HALU: Multiple Answer Korean Hallucination Benchmark for Large Language Models

  2093. Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning

  2094. Correlation and Navigation in the Vocabulary Key Representation Space of Language Models

  2095. CREAM: Consistency Regularized Self-Rewarding Language Models

  2096. Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos

  2097. ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding

  2098. MrT5: Dynamic Token Merging for Efficient Byte-level Language Models

  2099. Latent Action Pretraining from Videos

  2100. Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion

  2101. Transformer Encoder Satisfiability: Complexity and Impact on Formal Reasoning

  2102. CARTS: Advancing Neural Theorem Proving with Diversified Tactic Calibration and Bias-Resistant Tree Search

  2103. Bayesian Regularization of Latent Representation

  2104. DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

  2105. Convergence of Distributed Adaptive Optimization with Local Updates

  2106. Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for LLM Problem-Solving

  2107. Activation Gradient based Poisoned Sample Detection Against Backdoor Attacks

  2108. Designing Mechanical Meta-Materials by Learning Equivariant Flows

  2109. MCNC: Manifold-Constrained Reparameterization for Neural Compression

  2110. TypedThinker: Diversify Large Language Model Reasoning with Typed Thinking

  2111. SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection

  2112. Accelerating 3D Molecule Generation via Jointly Geometric Optimal Transport

  2113. Neural Dueling Bandits: Preference-Based Optimization with Human Feedback

  2114. Divergence of Neural Tangent Kernel in Classification Problems

  2115. Regret Bounds for Episodic Risk-Sensitive Linear Quadratic Regulator

  2116. How Low Can You Go? Searching for the Intrinsic Dimensionality of Complex Networks using Metric Node Embeddings

  2117. Advancing Prompt-Based Methods for Replay-Independent General Continual Learning

  2118. Robustness Auditing for Linear Regression: To Singularity and Beyond

  2119. Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents

  2120. RA-TTA: Retrieval-Augmented Test-Time Adaptation for Vision-Language Models

  2121. Decision Tree Induction Through LLMs via Semantically-Aware Evolution

  2122. Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning

  2123. Think Then React: Towards Unconstrained Action-to-Reaction Motion Generation

  2124. Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective

  2125. Triples as the Key: Structuring Makes Decomposition and Verification Easier in LLM-based TableQA

  2126. How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning

  2127. Protein Language Model Fitness is a Matter of Preference

  2128. Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Transformers

  2129. McEval: Massively Multilingual Code Evaluation

  2130. Privacy-Aware Lifelong Learning

  2131. MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models

  2132. Learning Graph Invariance by Harnessing Spuriosity

  2133. Mitigating Spurious Correlations in Zero-Shot Multimodal Models

  2134. The Breakdown of Gaussian Universality in Classification of High-dimensional Linear Factor Mixtures

  2135. Accurate and Scalable Graph Neural Networks via Message Invariance

  2136. Strong Preferences Affect the Robustness of Preference Models and Value Alignment

  2137. VICtoR: Learning Hierarchical Vision-Instruction Correlation Rewards for Long-horizon Manipulation

  2138. LeanAgent: Lifelong Learning for Formal Theorem Proving

  2139. BOFormer: Learning to Solve Multi-Objective Bayesian Optimization via Non-Markovian RL

  2140. Combining Induction and Transduction for Abstract Reasoning

  2141. Adversarial Training for Defense Against Label Poisoning Attacks

  2142. SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation

  2143. Matryoshka Multimodal Models

  2144. On Rollouts in Model-Based Reinforcement Learning

  2145. Information Theoretic Text-to-Image Alignment

  2146. Uncertainty Herding: One Active Learning Method for All Label Budgets

  2147. FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"

  2148. A Transfer Attack to Image Watermarks

  2149. Trajectory-LLM: A Language-based Data Generator for Trajectory Prediction in Autonomous Driving

  2150. Learning Hierarchical Polynomials of Multiple Nonlinear Features

  2151. Generalizable Motion Planning via Operator Learning

  2152. Efficient Causal Decision Making with One-sided Feedback

  2153. Certifying Language Model Robustness with Fuzzed Randomized Smoothing: An Efficient Defense Against Backdoor Attacks

  2154. Circuit Representation Learning with Masked Gate Modeling and Verilog-AIG Alignment

  2155. Polyrating: A Cost-Effective and Bias-Aware Rating System for LLM Evaluation

  2156. LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

  2157. Self-Supervised Diffusion Models for Electron-Aware Molecular Representation Learning

  2158. Exploring the Design Space of Visual Context Representation in Video MLLMs

  2159. Fair Submodular Cover

  2160. L3Ms — Lagrange Large Language Models

  2161. The 3D-PC: a benchmark for visual perspective taking in humans and machines

  2162. On the Optimal Memorization Capacity of Transformers

  2163. InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences

  2164. Efficient Cross-Episode Meta-RL

  2165. MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba

  2166. Adversaries With Incentives: A Strategic Alternative to Adversarial Robustness

  2167. A Formal Framework for Understanding Length Generalization in Transformers

  2168. Mask in the Mirror: Implicit Sparsification

  2169. Audio Large Language Models Can Be Descriptive Speech Quality Evaluators

  2170. Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents

  2171. Poisson-Dirac Neural Networks for Modeling Coupled Dynamical Systems across Domains

  2172. GPS: A Probabilistic Distributional Similarity with Gumbel Priors for Set-to-Set Matching

  2173. Multi-Task Dense Predictions via Unleashing the Power of Diffusion

  2174. Discovering Group Structures via Unitary Representation Learning

  2175. Newton Meets Marchenko-Pastur: Massively Parallel Second-Order Optimization with Hessian Sketching and Debiasing

  2176. Grokking at the Edge of Numerical Stability

  2177. MELODI: Exploring Memory Compression for Long Contexts

  2178. MIND: Math Informed syNthetic Dialogues for Pretraining LLMs

  2179. T2V2: A Unified Non-Autoregressive Model for Speech Recognition and Synthesis via Multitask Learning

  2180. FreCaS: Efficient Higher-Resolution Image Generation via Frequency-aware Cascaded Sampling

  2181. AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation

  2182. MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

  2183. A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts

  2184. Methods with Local Steps and Random Reshuffling for Generally Smooth Non-Convex Federated Optimization

  2185. Estimation of single-cell and tissue perturbation effect in spatial transcriptomics via Spatial Causal Disentanglement

  2186. Zero-shot forecasting of chaotic systems

  2187. Learning Video-Conditioned Policy on Unlabelled Data with Joint Embedding Predictive Transformer

  2188. Real-time design of architectural structures with differentiable mechanics and neural networks

  2189. Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice

  2190. Bridging Information Asymmetry in Text-video Retrieval: A Data-centric Approach

  2191. Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

  2192. Measuring And Improving Engagement of Text-to-Image Generation Models

  2193. Law of the Weakest Link: Cross Capabilities of Large Language Models

  2194. Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems

  2195. UTILITY: Utilizing Explainable Reinforcement Learning to Improve Reinforcement Learning

  2196. Discriminator-Guided Embodied Planning for LLM Agent

  2197. State Space Model Meets Transformer: A New Paradigm for 3D Object Detection

  2198. MGCFNN: A Neural MultiGrid Solver with Novel Fourier Neural Network for High Wave Number Helmholtz Equations

  2199. The Belief State Transformer

  2200. Measuring memorization in RLHF for code completion

  2201. On the Relation between Trainability and Dequantization of Variational Quantum Learning Models

  2202. Boosting Neural Combinatorial Optimization for Large-Scale Vehicle Routing Problems

  2203. Epistemic Monte Carlo Tree Search

  2204. HShare: Fast LLM Decoding by Hierarchical Key-Value Sharing

  2205. HyperPLR: Hypergraph Generation through Projection, Learning, and Reconstruction

  2206. Reliable and Diverse Evaluation of LLM Medical Knowledge Mastery

  2207. Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

  2208. DECO: Unleashing the Potential of ConvNets for Query-based Detection and Segmentation

  2209. Neural Sampling from Boltzmann Densities: Fisher-Rao Curves in the Wasserstein Geometry

  2210. Metalic: Meta-Learning In-Context with Protein Language Models

  2211. Gumbel Counterfactual Generation From Language Models

  2212. Solving Video Inverse Problems Using Image Diffusion Models

  2213. CLIPure: Purification in Latent Space via CLIP for Adversarially Robust Zero-Shot Classification

  2214. Relation-Aware Diffusion for Heterogeneous Graphs with Partially Observed Features

  2215. ClawMachine: Learning to Fetch Visual Tokens for Referential Comprehension

  2216. Physics-informed Temporal Difference Metric Learning for Robot Motion Planning

  2217. Noise Separation guided Candidate Label Reconstruction for Noisy Partial Label Learning

  2218. PolyNet: Learning Diverse Solution Strategies for Neural Combinatorial Optimization

  2219. FIRING-Net: A filtered feature recycling network for speech enhancement

  2220. Improving Neural Network Accuracy by Concurrently Training with a Twin Network

  2221. Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

  2222. VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks

  2223. Efficient Model Editing with Task-Localized Sparse Fine-tuning

  2224. Provence: efficient and robust context pruning for retrieval-augmented generation

  2225. Learning to Adapt Frozen CLIP for Few-Shot Test-Time Domain Adaptation

  2226. MUSE: Machine Unlearning Six-Way Evaluation for Language Models

  2227. Learning Harmonized Representations for Speculative Sampling

  2228. You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs

  2229. PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks

  2230. CryoFM: A Flow-based Foundation Model for Cryo-EM Densities

  2231. ZooProbe: A Data Engine for Evaluating, Exploring, and Evolving Large-scale Training Data for Multimodal LLMs

  2232. Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models

  2233. Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining

  2234. Text-to-Image Rectified Flow as Plug-and-Play Priors

  2235. SpaceGNN: Multi-Space Graph Neural Network for Node Anomaly Detection with Extremely Limited Labels

  2236. LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content

  2237. Data Unlearning in Diffusion Models

  2238. Adaptive backtracking line search

  2239. Fast Training of Sinusoidal Neural Fields via Scaling Initialization

  2240. When Prompt Engineering Meets Software Engineering: CNL-P as Natural and Robust "APIs" for Human-AI Interaction

  2241. ComPC: Completing a 3D Point Cloud with 2D Diffusion Priors

  2242. Bridging the Gap between Database Search and *De Novo* Peptide Sequencing with SearchNovo

  2243. Logical Consistency of Large Language Models in Fact-Checking

  2244. Vision-LSTM: xLSTM as Generic Vision Backbone

  2245. CTSyn: A Foundation Model for Cross Tabular Data Generation

  2246. P-SpikeSSM: Harnessing Probabilistic Spiking State Space Models for Long-Range Dependency Tasks

  2247. GameArena: Evaluating LLM Reasoning through Live Computer Games

  2248. Herald: A Natural Language Annotated Lean 4 Dataset

  2249. TabM: Advancing tabular deep learning with parameter-efficient ensembling

  2250. UNSURE: self-supervised learning with Unknown Noise level and Stein's Unbiased Risk Estimate

  2251. Spurious Forgetting in Continual Learning of Language Models

  2252. Release the Powers of Prompt Tuning: Cross-Modality Prompt Transfer

  2253. Heavy-Tailed Diffusion with Denoising Levy Probabilistic Models

  2254. Test-time Adaptation for Regression by Subspace Alignment

  2255. Accelerated Over-Relaxation Heavy-Ball Method: Achieving Global Accelerated Convergence with Broad Generalization

  2256. Locality Sensitive Avatars From Video

  2257. KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks

  2258. Conservative Contextual Bandits: Beyond Linear Representations

  2259. Is In-Context Learning Sufficient for Instruction Following in LLMs?

  2260. VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing

  2261. 3D Vision-Language Gaussian Splatting

  2262. Fat-to-Thin Policy Optimization: Offline Reinforcement Learning with Sparse Policies

  2263. Learning the Optimal Stopping for Early Classification within Finite Horizons via Sequential Probability Ratio Test

  2264. Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF

  2265. Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion

  2266. Language Models Are Implicitly Continuous

  2267. Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias

  2268. High-Dimensional Bayesian Optimisation with Gaussian Process Prior Variational Autoencoders

  2269. Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training

  2270. NExUME: Adaptive Training and Inference for DNNs under Intermittent Power Environments

  2271. TorchTitan: One-stop PyTorch native solution for production ready LLM pretraining

  2272. Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation

  2273. HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction

  2274. GROOT-2: Weakly Supervised Multimodal Instruction Following Agents

  2275. Redefining the task of Bioactivity Prediction

  2276. Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control

  2277. Can We Trust Embodied Agents? Exploring Backdoor Attacks against Embodied LLM-Based Decision-Making Systems

  2278. Concept-ROT: Poisoning Concepts in Large Language Models with Model Editing

  2279. Medium-Difficulty Samples Constitute Smoothed Decision Boundary for Knowledge Distillation on Pruned Datasets

  2280. Mixture of Attentions For Speculative Decoding

  2281. Reframing Structure-Based Drug Design Model Evaluation via Metrics Correlated to Practical Needs

  2282. Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors

  2283. A Riemannian Framework for Learning Reduced-order Lagrangian Dynamics

  2284. CLDyB: Towards Dynamic Benchmarking for Continual Learning with Pre-trained Models

  2285. Sensor-Invariant Tactile Representation

  2286. Diffusion Models as Cartoonists: The Curious Case of High Density Regions

  2287. 3DMolFormer: A Dual-channel Framework for Structure-based Drug Discovery

  2288. Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression

  2289. Shape as Line Segments: Accurate and Flexible Implicit Surface Representation

  2290. HOPE for a Robust Parameterization of Long-memory State Space Models

  2291. Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning

  2292. Kronecker Mask and Interpretive Prompts are Language-Action Video Learners

  2293. Large Language Models Often Say One Thing and Do Another

  2294. A Theory of Initialisation's Impact on Specialisation

  2295. Can a Large Language Model be a Gaslighter?

  2296. Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning

  2297. Local convergence of simultaneous min-max algorithms to differential equilibrium on Riemannian manifold

  2298. GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting

  2299. DiffPC: Diffusion-based High Perceptual Fidelity Image Compression with Semantic Refinement

  2300. A Distributional Approach to Uncertainty-Aware Preference Alignment Using Offline Demonstrations

  2301. Progress or Regress? Self-Improvement Reversal in Post-training

  2302. Halton Scheduler for Masked Generative Image Transformer

  2303. Error-quantified Conformal Inference for Time Series

  2304. Concept Bottleneck Large Language Models

  2305. Separation Power of Equivariant Neural Networks

  2306. MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

  2307. Reflexive Guidance: Improving OoDD in Vision-Language Models via Self-Guided Image-Adaptive Concept Generation

  2308. Neural Interactive Proofs

  2309. PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection

  2310. SGD with memory: fundamental properties and stochastic acceleration

  2311. A Simple Framework for Open-Vocabulary Zero-Shot Segmentation

  2312. Model-Free Offline Reinforcement Learning with Enhanced Robustness

  2313. Is Your Multimodal Language Model Oversensitive to Safe Queries?

  2314. Near, far: Patch-ordering enhances vision foundation models' scene understanding

  2315. Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better

  2316. ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time

  2317. Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs

  2318. Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts

  2319. Do Deep Neural Network Solutions Form a Star Domain?

  2320. Intrinsic Dimension Correlation: uncovering nonlinear connections in multimodal representations

  2321. Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation

  2322. Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models

  2323. InstaTrain: Adaptive Training via Ultra-Fast Natural Annealing within Dynamical Systems

  2324. Transformers Handle Endogeneity in In-Context Linear Regression

  2325. GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding

  2326. Size-Generalizable RNA Structure Evaluation by Exploring Hierarchical Geometries

  2327. Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF

  2328. Unsupervised Model Tree Heritage Recovery

  2329. A3D: Does Diffusion Dream about 3D Alignment?

  2330. Continuous Diffusion for Mixed-Type Tabular Data

  2331. Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning

  2332. PEARL: Parallel Speculative Decoding with Adaptive Draft Length

  2333. Aligning Human Motion Generation with Human Perceptions

  2334. Microcanonical Langevin Ensembles: Advancing the Sampling of Bayesian Neural Networks

  2335. Scale-aware Recognition in Satellite Images under Resource Constraints

  2336. MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences

  2337. Model-agnostic meta-learners for estimating heterogeneous treatment effects over time

  2338. Unleashing the Potential of Vision-Language Pre-Training for 3D Zero-Shot Lesion Segmentation via Mask-Attribute Alignment

  2339. State Space Models are Provably Comparable to Transformers in Dynamic Token Selection

  2340. ImageFolder: Autoregressive Image Generation with Folded Tokens

  2341. Model Equality Testing: Which Model is this API Serving?

  2342. Demystifying Topological Message-Passing with Relational Structures: A Case Study on Oversquashing in Simplicial Message-Passing

  2343. From an LLM Swarm to a PDDL-empowered Hive: Planning Self-executed Instructions in a Multi-modal Jungle

  2344. Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence

  2345. Generalized Behavior Learning from Diverse Demonstrations

  2346. SoftMatcha: A Soft and Fast Pattern Matcher for Billion-Scale Corpus Searches

  2347. Efficient Source-Free Time-Series Adaptation via Parameter Subspace Disentanglement

  2348. Do Contemporary Causal Inference Models Capture Real-World Heterogeneity? Findings from a Large-Scale Benchmark

  2349. Towards Self-Supervised Covariance Estimation in Deep Heteroscedastic Regression

  2350. Underdamped Diffusion Bridges with Applications to Sampling

  2351. A Closer Look at Machine Unlearning for Large Language Models

  2352. A Robust Method to Discover Causal or Anticausal Relation

  2353. Debiasing Mini-Batch Quadratics for Applications in Deep Learning

  2354. Scalable Benchmarking and Robust Learning for Noise-Free Ego-Motion and 3D Reconstruction from Noisy Video

  2355. SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction

  2356. EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing

  2357. Predicting the Energy Landscape of Stochastic Dynamical System via Physics-informed Self-supervised Learning

  2358. Beyond Random Masking: When Dropout meets Graph Convolutional Networks

  2359. Tight Clusters Make Specialized Experts

  2360. Domain Guidance: A Simple Transfer Approach for a Pre-trained Diffusion Model

  2361. RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards

  2362. Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

  2363. Gap Preserving Distillation by Building Bidirectional Mappings with A Dynamic Teacher

  2364. SegLLM: Multi-round Reasoning Segmentation with Large Language Models

  2365. Should VLMs be Pre-trained with Image Data?

  2366. InfoGS: Efficient Structure-Aware 3D Gaussians via Lightweight Information Shaping

  2367. Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning

  2368. CL-DiffPhyCon: Closed-loop Diffusion Control of Complex Physical Systems

  2369. LLM-wrapper: Black-Box Semantic-Aware Adaptation of Vision-Language Models for Referring Expression Comprehension

  2370. Deep MMD Gradient Flow without adversarial training

  2371. A transfer learning framework for weak to strong generalization

  2372. What to align in multimodal contrastive learning?

  2373. GenVP: Generating Visual Puzzles with Contrastive Hierarchical VAEs

  2374. Towards Continuous Reuse of Graph Models via Holistic Memory Diversification

  2375. Lightweight Predictive 3D Gaussian Splats

  2376. Universal Image Restoration Pre-training via Degradation Classification

  2377. Connectome Mapping: Shape-Memory Network via Interpretation of Contextual Semantic Information

  2378. LLaMA-Omni: Seamless Speech Interaction with Large Language Models

  2379. SONICS: Synthetic Or Not - Identifying Counterfeit Songs

  2380. Execution-guided within-prompt search for programming-by-example

  2381. Interpretable Unsupervised Joint Denoising and Enhancement for Real-World low-light Scenarios

  2382. A Training-Free Sub-quadratic Cost Transformer Model Serving Framework with Hierarchically Pruned Attention

  2383. ReSi: A Comprehensive Benchmark for Representational Similarity Measures

  2384. Autoregressive Pretraining with Mamba in Vision

  2385. Improved Training Technique for Latent Consistency Models

  2386. Mitigating Reward Over-Optimization in RLHF via Behavior-Supported Regularization

  2387. Gaussian Ensemble Belief Propagation for Efficient Inference in High-Dimensional, Black-box Systems

  2388. MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance

  2389. Generalization and Distributed Learning of GFlowNets

  2390. Humanizing the Machine: Proxy Attacks to Mislead LLM Detectors

  2391. Unlocking Efficient, Scalable, and Continual Knowledge Editing with Basis-Level Representation Fine-Tuning

  2392. RTop-K: Ultra-Fast Row-Wise Top-K Selection for Neural Network Acceleration on GPUs

  2393. Learning Robust Representations with Long-Term Information for Generalization in Visual Reinforcement Learning

  2394. Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Model Alignment

  2395. Select before Act: Spatially Decoupled Action Repetition for Continuous Control

  2396. Can Video LLMs Refuse to Answer? Alignment for Answerability in Video Large Language Models

  2397. Diffusion Models Are Real-Time Game Engines

  2398. Swift Hydra: Self-Reinforcing Generative Framework for Anomaly Detection with Multiple Mamba Models

  2399. QuaDiM: A Conditional Diffusion Model For Quantum State Property Estimation

  2400. Expected Sliced Transport Plans

  2401. B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

  2402. Provable Benefit of Annealed Langevin Monte Carlo for Non-log-concave Sampling

  2403. Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning

  2404. ACES: Automatic Cohort Extraction System for Event-Stream Datasets

  2405. GaussianAnything: Interactive Point Cloud Flow Matching for 3D Generation

  2406. InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales

  2407. Structure Language Models for Protein Conformation Generation

  2408. $q$-exponential family for policy optimization

  2409. GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement

  2410. Understanding Long Videos with Multimodal Language Models

  2411. Matérn Kernels for Tunable Implicit Surface Reconstruction

  2412. No Free Lunch: Fundamental Limits of Learning Non-Hallucinating Generative Models

  2413. An Online Learning Theory of Trading-Volume Maximization

  2414. ADAM: An Embodied Causal Agent in Open-World Environments

  2415. Learning High-Degree Parities: The Crucial Role of the Initialization

  2416. Endowing Visual Reprogramming with Adversarial Robustness

  2417. Unlocking the Potential of Model Calibration in Federated Learning

  2418. Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective

  2419. Non-myopic Generation of Language Models for Reasoning and Planning

  2420. Fast Direct: Query-Efficient Online Black-box Guidance for Diffusion-model Target Generation

  2421. Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction

  2422. Beyond Content Relevance: Evaluating Instruction Following in Retrieval Models

  2423. A Theoretical Framework for Partially-Observed Reward States in RLHF

  2424. Distance-Based Tree-Sliced Wasserstein Distance

  2425. ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy

  2426. Wavelet-based Positional Representation for Long Context

  2427. Robotouille: An Asynchronous Planning Benchmark for LLM Agents

  2428. Towards Unified Human Motion-Language Understanding via Sparse Interpretable Characterization

  2429. Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits

  2430. Rethinking Classifier Re-Training in Long-Tailed Recognition: Label Over-Smooth Can Balance

  2431. Nonasymptotic Analysis of Stochastic Gradient Descent with the Richardson–Romberg Extrapolation

  2432. SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects

  2433. Scalable Mechanistic Neural Networks

  2434. Inverse Attention Agents for Multi-Agent Systems

  2435. Vertical Federated Learning with Missing Features During Training and Inference

  2436. Progressive Mixed-Precision Decoding for Efficient LLM Inference

  2437. Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel

  2438. DoF: A Diffusion Factorization Framework for Offline Multi-Agent Reinforcement Learning

  2439. KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models

  2440. DS-LLM: Leveraging Dynamical Systems to Enhance Both Training and Inference of Large Language Models

  2441. Self-supervised contrastive learning performs non-linear system identification

  2442. SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration

  2443. SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments

  2444. OpenHands: An Open Platform for AI Software Developers as Generalist Agents

  2445. Procedural Synthesis of Synthesizable Molecules

  2446. Resolution Attack: Exploiting Image Compression to Deceive Deep Neural Networks

  2447. Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution

  2448. A Sanity Check for AI-generated Image Detection

  2449. Deep Random Features for Scalable Interpolation of Spatiotemporal Data

  2450. Meta-Continual Learning of Neural Fields

  2451. SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback

  2452. Erasing Concept Combination from Text-to-Image Diffusion Model

  2453. SparsyFed: Sparse Adaptive Federated Learning

  2454. Model-based Offline Reinforcement Learning with Lower Expectile Q-Learning

  2455. Maintaining Structural Integrity in Parameter Spaces for Parameter Efficient Fine-tuning

  2456. Arithmetic Without Algorithms: Language Models Solve Math with a Bag of Heuristics

  2457. Bayesian Treatment of the Spectrum of the Empirical Kernel in (Sub)Linear-Width Neural Networks

  2458. Few for Many: Tchebycheff Set Scalarization for Many-Objective Optimization

  2459. TPO: Aligning Large Language Models with Multi-branch & Multi-step Preference Trees

  2460. Direct Distributional Optimization for Provable Alignment of Diffusion Models

  2461. TRENDy: Temporal Regression of Effective Nonlinear Dynamics

  2462. Equivariant Masked Position Prediction for Efficient Molecular Representation

  2463. Decoupled Graph Energy-based Model for Node Out-of-Distribution Detection on Heterophilic Graphs

  2464. Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction

  2465. From Search to Sampling: Generative Models for Robust Algorithmic Recourse

  2466. Optimal Flow Transport and its Entropic Regularization: a GPU-friendly Matrix Iterative Algorithm for Flow Balance Satisfaction

  2467. Aligned LLMs Are Not Aligned Browser Agents

  2468. ProtPainter: Draw or Drag Protein via Topology-guided Diffusion

  2469. Learning Mask Invariant Mutual Information for Masked Image Modeling

  2470. Understanding Virtual Nodes: Oversquashing and Node Heterogeneity

  2471. A Watermark for Order-Agnostic Language Models

  2472. Hyperbolic Genome Embeddings

  2473. RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code

  2474. Probabilistic Conformal Prediction with Approximate Conditional Validity

  2475. Measuring And Improving Persuasiveness Of Large Language Models

  2476. Many-Objective Multi-Solution Transport

  2477. Text2PDE: Latent Diffusion Models for Accessible Physics Simulation

  2478. On the Expressive Power of Sparse Geometric MPNNs

  2479. Evaluating Semantic Variation in Text-to-Image Synthesis: A Causal Perspective

  2480. ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints

  2481. EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition

  2482. Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs

  2483. Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning

  2484. Preserving Diversity in Supervised Fine-Tuning of Large Language Models

  2485. Zero-shot Imputation with Foundation Inference Models for Dynamical Systems

  2486. CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding

  2487. Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping

  2488. Dynamic Modeling of Patients, Modalities and Tasks via Multi-modal Multi-task Mixture of Experts

  2489. [Plastic Learning with Deep Fourier Features](iclr_src/ICLR_2025_Main_Papers

