
Discover, Discuss, and Read arXiv papers

Discover new, recommended papers

Stanford University
Researchers from Tsinghua University, Stanford University, and the Max Planck Institute for Informatics developed a deterministic algorithm for the single-source shortest path problem on directed graphs with non-negative real edge weights. The algorithm achieves O(m log^(2/3) n) time complexity, marking the first time the long-standing O(m + n log n) sorting barrier has been surpassed in the comparison-addition model for this problem.
The Agentic Context Engineering (ACE) framework dynamically evolves and curates comprehensive 'playbook' contexts for large language models, allowing them to continuously improve performance. This enables smaller, open-source models to match or exceed proprietary LLM agent performance on benchmarks like AppWorld, simultaneously reducing adaptation latency by up to 91.5% and token cost by 83.6%.
Direct Preference Optimization (DPO) introduces a method for fine-tuning large language models to align with human preferences that avoids the complexity of Reinforcement Learning from Human Feedback (RLHF). It reparameterizes the RLHF objective, allowing direct policy optimization and matching or exceeding PPO-based methods in performance and stability across summarization and dialogue tasks.
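At its core, DPO replaces the reward model and PPO loop with a single classification-style loss on preference pairs. A minimal sketch, assuming per-sequence log-probabilities have already been computed for the policy and the frozen reference model (argument names and the β value are illustrative):

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss (sketch).

    Each argument is a tensor of summed log-probabilities of the chosen /
    rejected responses under the trained policy or the frozen reference
    model, shape (batch,).
    """
    # Implicit reward margins: how much more the policy prefers each
    # response relative to the reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Binary cross-entropy on the margin: push chosen above rejected.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

Because the implicit reward is just β times the policy-to-reference log-ratio, no separate reward model ever needs to be trained.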
FlowRL reimagines large language model post-training by optimizing for reward distribution matching instead of simple reward maximization, yielding improved accuracy on math and code benchmarks and substantially greater diversity in reasoning paths.
OpenVLA introduces a fully open-source, 7B-parameter Vision-Language-Action model that sets a new state of the art for generalist robot manipulation, outperforming larger closed-source models by 16.5% absolute success rate. The model also demonstrates effective and efficient fine-tuning strategies for adapting to new robot setups and tasks on commodity hardware.
This paper introduces Energy-Based Transformers (EBTs), a new class of models that enable scalable System 2 thinking through unsupervised learning by reframing prediction as an optimization process over a learned energy function. EBTs demonstrate superior scaling rates compared to standard Transformers in language and video, improve performance by up to 29% with increased inference-time computation, and achieve better generalization on out-of-distribution data across diverse modalities.
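The "thinking as optimization" idea can be illustrated with a generic energy-based inference loop; the sketch below is a simplification under assumptions (the energy network, step size, and step count are placeholders), not the paper's exact procedure:

```python
import torch

def ebt_predict(energy_fn, context, y_init, steps=8, step_size=0.5):
    """Inference-as-optimization sketch: refine a candidate prediction by
    gradient descent on a learned energy function E(context, y).

    `energy_fn` is assumed to return a scalar energy per example; more
    descent steps spend more inference-time compute on the same input.
    """
    y = y_init.clone().requires_grad_(True)
    for _ in range(steps):
        energy = energy_fn(context, y).sum()
        (grad,) = torch.autograd.grad(energy, y)
        y = (y - step_size * grad).detach().requires_grad_(True)
    return y.detach()
```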
FlashAttention, developed at Stanford's Hazy Research lab, introduces an exact attention algorithm that optimizes memory access patterns to mitigate the IO bottleneck on GPUs. This method achieves substantial speedups and reduces memory footprint for Transformer models, enabling processing of significantly longer sequence lengths.
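The key trick is computing softmax attention over key/value tiles with running statistics, so the full attention matrix never has to be written to slow GPU memory. A NumPy sketch of that online-softmax accumulation (educational only; the real implementation is a fused CUDA kernel with SRAM tiling):

```python
import numpy as np

def tiled_attention(q, k, v, block=128):
    """Attention over key/value tiles with a running (online) softmax.

    q: (nq, d), k: (nk, d), v: (nk, dv). The n x n score matrix is never
    materialized; only per-row running max and sum are kept.
    """
    nq, d = q.shape
    out = np.zeros((nq, v.shape[1]))
    row_max = np.full(nq, -np.inf)   # running max of scores per query
    row_sum = np.zeros(nq)           # running softmax denominator
    scale = 1.0 / np.sqrt(d)
    for start in range(0, k.shape[0], block):
        kb, vb = k[start:start + block], v[start:start + block]
        scores = (q @ kb.T) * scale                   # (nq, block)
        new_max = np.maximum(row_max, scores.max(axis=1))
        correction = np.exp(row_max - new_max)        # rescale old statistics
        p = np.exp(scores - new_max[:, None])
        row_sum = row_sum * correction + p.sum(axis=1)
        out = out * correction[:, None] + p @ vb
        row_max = new_max
    return out / row_sum[:, None]
```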
The RLAD framework enables large language models to self-discover and leverage high-level reasoning abstractions, leading to substantial improvements in accuracy and compute efficiency on challenging mathematical reasoning tasks. This approach teaches models to propose and utilize concise procedural and factual knowledge to guide complex problem-solving.
Researchers at Stanford, UW, and AI2 developed `s1-32B`, an open-source model that achieves state-of-the-art reasoning performance and clear test-time scaling on challenging benchmarks by fine-tuning on only 1,000 high-quality reasoning samples and employing a simple 'budget forcing' inference technique.
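Budget forcing is a decode-time control: cap how many reasoning tokens the model may emit, and if it tries to close its reasoning too early, append a continuation cue such as "Wait" and keep decoding. A rough sketch, where `llm.generate` and the `</think>` delimiter are hypothetical stand-ins for whatever serving API and chat template are in use:

```python
def approx_tokens(text):
    # Stand-in for a real tokenizer; whitespace split is close enough here.
    return len(text.split())

def generate_with_budget_forcing(llm, prompt, max_think=2000, min_think=500):
    """Budget-forcing sketch around a hypothetical
    llm.generate(text, stop, max_new_tokens) helper."""
    trace = ""
    while approx_tokens(trace) < max_think:
        chunk = llm.generate(prompt + trace, stop="</think>",
                             max_new_tokens=max_think - approx_tokens(trace))
        trace += chunk
        if approx_tokens(trace) >= min_think:
            break            # enough thinking: let the model stop
        trace += " Wait"     # too short: suppress the stop and keep reasoning
    # Close the thinking block (forced if the budget ran out) and answer.
    return llm.generate(prompt + trace + "</think>\nFinal answer:", stop=None,
                        max_new_tokens=256)
```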
vLLM introduces an innovative LLM serving system, employing PagedAttention for efficient Key-Value cache management by drawing inspiration from operating system paging. This system achieves 2-4x higher throughput compared to existing solutions by eliminating memory fragmentation and enabling dynamic memory allocation for LLM inference.
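PagedAttention manages the KV cache the way an operating system manages virtual memory: each sequence holds a block table mapping logical token positions to small fixed-size physical blocks, allocated on demand and returned to a free pool when the sequence finishes. A toy allocator sketch (block size and data structures are illustrative, not vLLM's internals):

```python
class KVBlockManager:
    """Toy PagedAttention-style KV-cache paging."""

    def __init__(self, num_physical_blocks, block_size=16):
        self.block_size = block_size
        self.free_blocks = list(range(num_physical_blocks))
        self.block_tables = {}   # seq_id -> list of physical block ids
        self.seq_lens = {}       # seq_id -> number of cached tokens

    def append_token(self, seq_id):
        """Reserve cache space for one new token; allocate a fresh physical
        block only when the sequence's last block is full."""
        used = self.seq_lens.get(seq_id, 0)
        table = self.block_tables.setdefault(seq_id, [])
        if used == len(table) * self.block_size:
            if not self.free_blocks:
                raise MemoryError("no free KV blocks; preempt a sequence")
            table.append(self.free_blocks.pop())
        self.seq_lens[seq_id] = used + 1

    def physical_slot(self, seq_id, token_idx):
        """Translate a logical token position to (physical block, offset)."""
        block = self.block_tables[seq_id][token_idx // self.block_size]
        return block, token_idx % self.block_size

    def free(self, seq_id):
        """Return all of a finished sequence's blocks to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)
```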
Researchers at Stanford University and Google present 'generative agents,' an architecture that extends large language models to create interactive simulacra of human behavior. The system incorporates dynamic memory, recursive reflection, and hierarchical planning, enabling individual agents to act believably and fostering emergent social dynamics like information diffusion and relationship formation in a simulated environment.
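Memory retrieval in the architecture ranks stored observations by a weighted combination of recency, importance, and relevance. A schematic version of that scoring, with field names and weights chosen for illustration:

```python
import math
import time

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieval_score(memory, query_embedding, now=None,
                    decay=0.995, weights=(1.0, 1.0, 1.0)):
    """Rank a memory by recency (exponential decay since last access),
    importance (an LLM-assigned 1-10 score), and relevance (embedding
    similarity to the query). `memory` fields are illustrative."""
    now = now or time.time()
    hours_since_access = (now - memory["last_accessed"]) / 3600.0
    recency = decay ** hours_since_access
    importance = memory["importance"] / 10.0
    relevance = cosine(memory["embedding"], query_embedding)
    w_rec, w_imp, w_rel = weights
    return w_rec * recency + w_imp * importance + w_rel * relevance
```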
ControlNet enables large, pretrained text-to-image diffusion models to accept fine-grained spatial and structural control through various image-based conditions, such as edge maps or human poses. The Stanford University team achieved this by adding a trainable copy of the model's U-Net blocks that learns to incorporate conditions, while the original model's parameters remain frozen, preserving its capabilities and allowing efficient training on smaller datasets.
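The wiring can be sketched as a frozen block plus a trainable copy whose contribution enters through zero-initialized convolutions, so training starts as an exact no-op and cannot damage the pretrained model. A schematic PyTorch module (channel sizes, names, and the single-argument block interface are simplifications, not the diffusers API):

```python
import copy
import torch.nn as nn

def zero_module(module):
    # Zero-initialize so the control branch contributes nothing at step 0.
    for p in module.parameters():
        nn.init.zeros_(p)
    return module

class ControlledBlock(nn.Module):
    """One frozen U-Net block plus its ControlNet-style trainable copy."""

    def __init__(self, frozen_block, channels):
        super().__init__()
        self.trainable_copy = copy.deepcopy(frozen_block)  # learns the condition
        self.frozen_block = frozen_block
        for p in self.frozen_block.parameters():
            p.requires_grad_(False)                        # original stays frozen
        self.cond_in = zero_module(nn.Conv2d(channels, channels, 1))
        self.zero_out = zero_module(nn.Conv2d(channels, channels, 1))

    def forward(self, x, condition):
        control = self.trainable_copy(x + self.cond_in(condition))
        return self.frozen_block(x) + self.zero_out(control)
```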
Researchers at Cornell Tech developed Block Discrete Denoising Diffusion Language Models (BD3-LMs), a new class of generative models that combine autoregressive generation over text blocks with discrete diffusion within each block. This approach significantly improves perplexity, achieves state-of-the-art results for discrete diffusion models, and enables flexible-length text generation with improved inference efficiency via KV caching.
Researchers from Northwestern University, Stanford, NYU, and Microsoft developed StarPO, a reinforcement learning framework, and RAGEN, a modular system, to train self-evolving large language model agents in multi-turn environments. The work identifies a new instability called the "Echo Trap" and proposes StarPO-S, which uses uncertainty-based filtering and gradient shaping to achieve more robust training and higher success rates across various interactive tasks.
A comprehensive, brain-inspired framework integrates diverse research areas of LLM-based intelligent agents, encompassing individual architecture, collaborative systems, and safety. The framework formally conceptualizes agent components, maps AI capabilities to human cognition to identify research gaps, and outlines a roadmap for developing autonomous, adaptive, and safe AI.
Denoising Diffusion Implicit Models (DDIMs) from Stanford University generalize the diffusion process, enabling existing diffusion models to generate high-quality samples 10x to 50x faster. This approach also unlocks capabilities like semantically meaningful latent space interpolation and accurate image reconstruction, making diffusion models more practical.
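The speedup comes from DDIM's non-Markovian update, which can jump between widely spaced timesteps and becomes fully deterministic when its η parameter is zero. A sketch of one sampling step, written in terms of the cumulative noise schedule ᾱ:

```python
import torch

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev, eta=0.0):
    """One DDIM update from timestep t to an (arbitrarily earlier) timestep.

    eta=0 gives the deterministic update that enables skipping most steps;
    eta=1 recovers a DDPM-like stochastic update.
    """
    # Clean sample implied by the current noise estimate.
    x0_pred = (x_t - (1 - alpha_bar_t) ** 0.5 * eps_pred) / alpha_bar_t ** 0.5
    # Noise scale for this step (zero when eta=0).
    sigma = eta * (((1 - alpha_bar_prev) / (1 - alpha_bar_t))
                   * (1 - alpha_bar_t / alpha_bar_prev)) ** 0.5
    # Direction pointing back toward x_t, then optional fresh noise.
    dir_xt = (1 - alpha_bar_prev - sigma ** 2) ** 0.5 * eps_pred
    noise = sigma * torch.randn_like(x_t)
    return alpha_bar_prev ** 0.5 * x0_pred + dir_xt + noise
```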
Researchers at Stanford University developed an Optimized Fine-Tuning (OFT) recipe for Vision-Language-Action (VLA) models, substantially increasing their inference speed and task success on new robotic setups, including complex bimanual manipulation, by integrating parallel decoding, action chunking, and L1 regression with continuous actions.
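The recipe's output side can be pictured as a small head that maps the VLA backbone's features to an entire chunk of continuous actions in one parallel pass, trained with a plain L1 loss instead of autoregressive action-token classification. A schematic sketch with illustrative dimensions:

```python
import torch.nn as nn

class ChunkedActionHead(nn.Module):
    """Map pooled VLA features to a whole chunk of continuous actions."""

    def __init__(self, hidden_dim=4096, action_dim=7, chunk_len=8):
        super().__init__()
        self.chunk_len, self.action_dim = chunk_len, action_dim
        self.proj = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.GELU(),
            nn.Linear(hidden_dim, chunk_len * action_dim),
        )

    def forward(self, backbone_hidden):
        # backbone_hidden: (batch, hidden_dim) pooled features from the VLA.
        actions = self.proj(backbone_hidden)
        return actions.view(-1, self.chunk_len, self.action_dim)

def l1_action_loss(predicted_chunk, target_chunk):
    # L1 regression over the full action chunk.
    return (predicted_chunk - target_chunk).abs().mean()
```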
A gating mechanism applied to the output of scaled dot-product attention in large language models improves training stability and performance across benchmarks while mitigating attention sink issues, demonstrated through extensive experiments on 15B parameter MoE models and 1.7B dense models trained on 3.5 trillion tokens.
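The mechanism itself is small: an input-conditioned sigmoid gate multiplied elementwise into the attention output before the output projection. A minimal sketch of that gating (the paper's per-head and placement variants are omitted):

```python
import torch
import torch.nn as nn

class GatedAttentionOutput(nn.Module):
    """Sigmoid gate on the output of scaled dot-product attention."""

    def __init__(self, hidden_dim):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, attn_output, hidden_states):
        # A gate in (0, 1), conditioned on the layer input, lets the model
        # cheaply suppress unwanted attention mass (e.g. the attention sink
        # on the first token).
        gate = torch.sigmoid(self.gate_proj(hidden_states))
        return attn_output * gate
```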
BeyondMimic, a collaborative effort from UC Berkeley and Stanford University, provides a unified framework for robust humanoid control on physical hardware, combining high-fidelity motion tracking and versatile, task-specific control via guided diffusion. The system successfully executes a wide range of dynamic human motions on a Unitree G1 robot without external motion capture, also enabling zero-shot, goal-driven behaviors.
Researchers from Stanford University, Meta, and UC Berkeley developed ALOHA, a low-cost, open-source hardware system, alongside ACT, a novel imitation learning algorithm, to enable precise fine-grained bimanual manipulation on affordable robots. The system successfully executes complex tasks such as threading zip cable ties, manipulating small objects, and juggling ping pong balls, demonstrating how advanced learning can compensate for hardware limitations.