SpecCache

This repository contains the data and code for the paper "What Limits Agentic Systems Efficiency?"

Benchmark 🛠️ • Agent ⚙️ • Contributing 🐜

Benchmark

The code used to measure LLM latency (as described in Section 2 of the paper) is provided in the benchmark folder.

Environment Setup

To get started with benchmarking, first set up the environment:

cd benchmark
conda create -n api python=3.10
conda activate api
pip install -r requirements.txt

Configure API Key

Replace OAI_API_KEY in utils_completions.py and utils_completions_priority.py with your API key.
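
As an alternative to hard-coding the key, one option is to read it from an environment variable. The sketch below is an assumption about how this could be wired in, not a description of how the repository's utility files are written; the helper name `load_api_key` is hypothetical, and only the variable name `OAI_API_KEY` comes from the instructions above.

```python
import os


def load_api_key(name: str = "OAI_API_KEY") -> str:
    """Return the API key stored in the environment variable `name`.

    Hypothetical helper: the repository's files assign the key directly.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Environment variable {name} is not set")
    return key
```

With such a helper in place, the assignment in a utility file could become `OAI_API_KEY = load_api_key()`, which keeps the key out of version control.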

Run Experiments

To run the standard LLM latency measurement:

python benchmark/parallel/run.py

To enable OpenAI priority processing, run:

python benchmark/parallel_priority/run.py

To evaluate latency across different LLM providers, update the respective API key variables in utils_completions.py: DS_API_KEY (DeepSeek), TOGETHER_API_KEY (Together AI), ANTHROPIC_API_KEY (Anthropic), GOOGLE_API_KEY (Google), and CENTML_API_KEY (CentML).
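
For a rough sense of what a parallel latency measurement involves, the following is a minimal sketch under stated assumptions. It is not the code in `benchmark/parallel/run.py`; the names `time_call`, `measure_latency`, and `fake_request` are hypothetical, and a sleep stands in for a real API request so that the example runs without a key.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def time_call(request_fn: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by one call to request_fn."""
    start = time.perf_counter()
    request_fn()
    return time.perf_counter() - start


def measure_latency(
    request_fn: Callable[[], object],
    num_requests: int = 8,
    max_workers: int = 4,
) -> List[float]:
    """Issue num_requests calls across a thread pool and collect latencies."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(time_call, request_fn) for _ in range(num_requests)]
        return [f.result() for f in futures]


def fake_request() -> None:
    # Stand-in for a real API call (assumption); sleeps to simulate latency.
    time.sleep(0.01)


if __name__ == "__main__":
    latencies = measure_latency(fake_request)
    print(f"median: {statistics.median(latencies):.3f}s, max: {max(latencies):.3f}s")
```

Replacing `fake_request` with a function that sends one completion request to a provider would turn this into a per-request latency probe; reporting the median alongside the maximum helps separate typical latency from tail latency.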

Agent

Environment Setup

SpecCache is implemented on top of the Qwen WebWalker agent. To run the SpecCache agent (as described in Sections 3 and 4 of the paper), set up the environment as follows:

cd SpecCache
conda create -n speccache python=3.10
conda activate speccache
pip install -r requirements.txt
crawl4ai-setup
crawl4ai-doctor

Configure API Key

Replace api_key, provider, and model_server in speccache_webwalkerqa_example.py with your own setup.

Run Experiments

To run the agent:

cd SpecCache
python speccache_webwalkerqa_example.py

Currently, the agent is tested on the English subset of the WebWalker_QA dataset. Feel free to change the dataset and the dataset parsing code to run the agent on a wider variety of datasets.
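
When adapting the parsing code to another dataset, the core step is usually selecting the records to run on. The sketch below illustrates that step only; the function name `select_subset`, the `lang` field, and the toy records are all assumptions for illustration and do not reflect the actual WebWalker_QA schema or the repository's parsing code.

```python
from typing import Dict, Iterable, List


def select_subset(records: Iterable[Dict], lang: str = "en") -> List[Dict]:
    """Keep records whose (hypothetical) "lang" field matches `lang`."""
    return [r for r in records if r.get("lang") == lang]


# Toy records standing in for a real dataset; the field names are assumptions.
records = [
    {"question": "Q1", "lang": "en"},
    {"question": "Q2", "lang": "zh"},
]
english = select_subset(records)
```

Any real dataset will need its own field names and filtering criteria substituted here.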

Contributing

Authors: Song Bian*, Minghao Yan*, Anand Jayarajan, Gennady Pekhimenko, Shivaram Venkataraman

Affiliations: University of Wisconsin-Madison, University of Toronto, and NVIDIA.

Citation

If you find the idea or code useful for your research, please consider citing our paper:

@article{bian2025limits,
  title={What Limits Agentic Systems Efficiency?},
  author={Bian, Song and Yan, Minghao and Jayarajan, Anand and Pekhimenko, Gennady and Venkataraman, Shivaram},
  journal={arXiv preprint arXiv:2510.16276},
  year={2025}
}
