FrontierCO: A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization
Combinatorial optimization plays a fundamental role in discrete mathematics, computer science, and operations research, with applications in routing, scheduling, allocation, and more. As ML-based solvers evolve—ranging from neural networks to symbolic reasoning with large language models—FrontierCO offers the first comprehensive dataset suite tailored to test these solvers at realistic scales and difficulties.
More details about FrontierCO are available in our paper. For questions, please contact [email protected] or [email protected].
Download the raw data from https://huggingface.co/datasets/CO-Bench/FrontierCO to the local directory `data`:

```python
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id='CO-Bench/FrontierCO',
    repo_type='dataset',
    local_dir='data',
)
```

Please refer to the instructions under each problem folder in `data` for how to:
- apply the human-designed classical solvers
- generate the training data for neural solvers
All the neural solvers evaluated in this work are open-source models. Please refer to their official GitHub repos for the training and evaluation code.
Check out Evaluation-on-FrontierCO.
Below is the code to evaluate the Greedy Refinement agent on CFLP for 64 iterations with a 300s timeout.
```python
# We use new agent implementations in FrontierCO:
from agents import YieldGreedyRefine, YieldFunSearch, YieldReEvo
# And a new evaluator to fetch solutions yielded by the solver,
# evaluating only the last solution before timeout:
from evaluation import YieldingEvaluator, get_new_data

# Load data
data = get_new_data(task, src_dir='data', data_dir='data')

# Define agent (example: YieldGreedyRefine)
agent = YieldGreedyRefine(
    problem_description=data.problem_description,
    timeout=300,  # 300s timeout during solver development
    model='openai/o3-mini',  # We use LiteLLM to call the API
)

# Load YieldingEvaluator (300s timeout during solver development)
evaluator = YieldingEvaluator(data, timeout=300)

# Run for 64 iterations
for it in range(64):
    code = agent.step()
    if code is None:  # agent decides to terminate
        break
    feedback = evaluator.evaluate(code)  # Run evaluation
    agent.feedback(feedback.dev_score, feedback.dev_feedback)  # Use dev set score as feedback

# Get the final solution
code = agent.finalize()

# For final evaluation, run the solver for 1 hour
final_evaluator = YieldingEvaluator(data, timeout=60 * 60)
feedback = final_evaluator.evaluate(code)
print(feedback.test_feedback)  # Test set score
```

Agents are implemented in the `agents` module. Currently supported agents include: GreedyRefine, DirectAnswer, BestOfN, FunSearch (link), AIDE (link), ChainOfExperts (link), and ReEvo (link). LLMs are supported via LiteLLM.
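The yielding protocol mentioned in the comments above (the solver emits progressively better solutions, and the evaluator scores only the last one produced before the timeout) can be sketched roughly as follows. Note that `solve`, `last_solution_before_timeout`, and all timing values here are illustrative placeholders, not the actual FrontierCO implementation:

```python
import time

def solve(budget_s: float):
    """Toy solver: yields successively better solutions until its budget runs out."""
    best, start = 0, time.monotonic()
    while time.monotonic() - start < budget_s:
        best += 1  # stand-in for an actual improvement step
        yield best
        time.sleep(0.01)

def last_solution_before_timeout(solver, timeout_s: float):
    """Keep only the most recent yielded solution; stop once the timeout is hit."""
    deadline = time.monotonic() + timeout_s
    last = None
    for sol in solver:
        last = sol
        if time.monotonic() >= deadline:
            break
    return last

# The harness discards intermediate solutions and keeps the last one.
print(last_solution_before_timeout(solve(budget_s=1.0), timeout_s=0.1))
```

This design lets anytime solvers get credit for partial progress instead of scoring zero when they cannot prove optimality within the budget.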
Each agent implements the following functions:
- `step()`: Returns the next candidate code for evaluation, or `None` to terminate early.
- `feedback(score, feedback)`: Receives the evaluation score and textual feedback for the last candidate.
- `finalize()`: Returns the final solution code.
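A custom agent only needs to implement this three-method interface. Below is a minimal, self-contained sketch; the class name, the hard-coded candidate strings, and the best-score selection strategy are all hypothetical, chosen only to illustrate the contract:

```python
class BestScoreAgent:
    """Illustrative agent: proposes candidate code strings, keeps the best-scoring one."""

    def __init__(self, max_iters=4):
        self.max_iters = max_iters
        # Stand-ins for LLM-generated solver code.
        self.candidates = [f"# candidate solver v{i}" for i in range(max_iters)]
        self.history = []  # list of (score, code) pairs

    def step(self):
        # Return the next candidate, or None to terminate early.
        if len(self.history) >= self.max_iters:
            return None
        return self.candidates[len(self.history)]

    def feedback(self, score, feedback):
        # Record the score the evaluator assigned to the last candidate.
        code = self.candidates[len(self.history)]
        self.history.append((score, code))

    def finalize(self):
        # Return the best-scoring candidate seen so far.
        return max(self.history)[1] if self.history else None
```

Such an agent drops into the evaluation loop above unchanged: the loop calls `step()`, evaluates the returned code, and passes the score back through `feedback()`.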
Please cite our work as:

```bibtex
@misc{feng2025comprehensiveevaluationcontemporarymlbased,
      title={A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization},
      author={Shengyu Feng and Weiwei Sun and Shanda Li and Ameet Talwalkar and Yiming Yang},
      year={2025},
      eprint={2505.16952},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.16952},
}
```
