This repository contains the data and code for the paper "What Limits Agentic Systems Efficiency?".
The code used to measure LLM latency (as described in Section 2 of the paper) is provided in the benchmark folder.
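As a rough illustration of what a latency measurement of this kind involves, the sketch below times concurrent requests and summarizes the per-request wall-clock latency. It is not the code in the benchmark folder: `fake_llm_call`, `timed_call`, and `measure_latency` are hypothetical names, and the stand-in call simply sleeps instead of contacting a provider.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_llm_call(prompt: str) -> str:
    """Hypothetical stand-in for a real API request; sleeps instead of calling a provider."""
    time.sleep(0.01)
    return f"echo: {prompt}"


def timed_call(fn, prompt: str) -> float:
    """Return the wall-clock latency of a single call, in seconds."""
    start = time.perf_counter()
    fn(prompt)
    return time.perf_counter() - start


def measure_latency(fn, prompts, max_workers: int = 4) -> dict:
    """Issue the prompts concurrently and summarize per-request latency."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        latencies = list(pool.map(lambda p: timed_call(fn, p), prompts))
    return {
        "n": len(latencies),
        "mean_s": statistics.mean(latencies),
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
    }


if __name__ == "__main__":
    print(measure_latency(fake_llm_call, [f"prompt {i}" for i in range(8)]))
```

To time a real provider, one would replace `fake_llm_call` with a function that sends the request and returns once the response is complete.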
To get started with benchmarking, first set up the environment:
cd benchmark
conda create -n api python=3.10
conda activate api
pip install -r requirements.txt
Replace OAI_API_KEY in utils_completions.py and utils_completions_priority.py with your API key.
To run the standard LLM latency measurement:
python benchmark/parallel/run.py
To enable OpenAI priority processing, run:
python benchmark/parallel_priority/run.py
To evaluate latency across different LLM providers, update the respective API key variables in utils_completions.py:
DS_API_KEY (DeepSeek), TOGETHER_API_KEY (Together AI), ANTHROPIC_API_KEY (Anthropic), GOOGLE_API_KEY (Google), and CENTML_API_KEY (CentML).
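If you would rather not hard-code keys in source files, one option is to read them from environment variables. The sketch below is an assumption about how that could look, not how utils_completions.py currently works: the variable names are the ones listed in this README, while `load_api_keys` and `require_key` are hypothetical helpers you would have to wire into the scripts yourself.

```python
import os

# Variable names as they appear in this README.
KEY_NAMES = (
    "OAI_API_KEY",
    "DS_API_KEY",
    "TOGETHER_API_KEY",
    "ANTHROPIC_API_KEY",
    "GOOGLE_API_KEY",
    "CENTML_API_KEY",
)


def load_api_keys(env=None) -> dict:
    """Hypothetical helper: collect whichever provider keys are set and non-empty."""
    env = os.environ if env is None else env
    return {name: env[name] for name in KEY_NAMES if env.get(name)}


def require_key(keys: dict, name: str) -> str:
    """Return the key for `name`, or fail with a readable message."""
    if name not in keys:
        raise KeyError(f"{name} is not set; export it before running the benchmark")
    return keys[name]


if __name__ == "__main__":
    available = load_api_keys()
    print("Providers with keys configured:", sorted(available))
```

Keeping keys in the environment also makes it harder to commit them to version control by accident.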
SpecCache is implemented on top of the Qwen WebWalker agent. To run the SpecCache agent (as described in Sections 3 and 4 of the paper), set up the environment as follows:
cd SpecCache
conda create -n speccache python=3.10
conda activate speccache
pip install -r requirements.txt
crawl4ai-setup
crawl4ai-doctor
Replace api_key, provider, and model_server in speccache_webwalkerqa_example.py with your own setup.
To run the agent:
cd SpecCache
python speccache_webwalkerqa_example.py
Currently, the agent is tested on the English subset provided by the WebWalker_QA dataset. Feel free to change the dataset and the dataset parsing code to run the agent on a wider variety of datasets.
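When adapting the parsing code to another dataset, the core step is usually selecting the records you want to evaluate on. The sketch below shows one way to filter by language. It assumes, purely for illustration, that each record is a dict with a top-level "lang" field; the actual WebWalker_QA schema and the parsing code in speccache_webwalkerqa_example.py may differ, and `select_language` is a hypothetical helper.

```python
from typing import Iterable, Iterator


def select_language(records: Iterable[dict], lang: str = "en") -> Iterator[dict]:
    """Yield only the records whose (assumed) "lang" field matches `lang`."""
    for record in records:
        if record.get("lang") == lang:
            yield record


if __name__ == "__main__":
    # Made-up records, only to show the shape this sketch expects.
    sample = [
        {"question": "q1", "answer": "a1", "lang": "en"},
        {"question": "q2", "answer": "a2", "lang": "zh"},
    ]
    for record in select_language(sample):
        print(record["question"])
```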
Authors: Song Bian*, Minghao Yan*, Anand Jayarajan, Gennady Pekhimenko, Shivaram Venkataraman
Affiliations: University of Wisconsin-Madison, University of Toronto, and NVIDIA.
If you find the idea or code useful for your research, please consider citing our paper:
@article{bian2025limits,
title={What Limits Agentic Systems Efficiency?},
author={Bian, Song and Yan, Minghao and Jayarajan, Anand and Pekhimenko, Gennady and Venkataraman, Shivaram},
journal={arXiv preprint arXiv:2510.16276},
year={2025}
}