MoNaCo is a human-written benchmark of natural, complex questions annotated with their full reasoning chains: intermediate questions, answers, and evidence. MoNaCo contains 1,315 complex questions, each requiring evidence from 44.3 different Wikipedia pages on average.
This repository hosts the MoNaCo website and leaderboard.
For more details, check out our TACL paper, "MoNaCo: More Natural and Complex Questions for Reasoning Across Dozens of Documents", and our website.
For the MoNaCo data, please refer to our HuggingFace repository at: https://huggingface.co/datasets/allenai/MoNaCo_Benchmark.
- Key Links
  - MoNaCo dataset: https://huggingface.co/datasets/allenai/MoNaCo_Benchmark
  - Paper: "MoNaCo: More Natural and Complex Questions for Reasoning Across Dozens of Documents"
  - Code: coming soon
  - Leaderboard: see the leaderboard table on the website
  - LLM-judge prompt: https://github.com/tomerwolgithub/monaco/tree/main/prompts
  - Website: https://tomerwolgithub.github.io/monaco/
- 8/17/2025: The codebase will be released soon.
- 8/17/2025: Check out the official Ai2 blogpost.
- 8/17/2025: MoNaCo has been accepted to appear in TACL; check out our preprint, available here.
- 8/17/2025: The full MoNaCo dataset has been released; see the HuggingFace dataset!
@article{wolfson-etal-2025-monaco,
    title = "MoNaCo: More Natural and Complex Questions for Reasoning Across Dozens of Documents",
    author = "Wolfson, Tomer and
      Trivedi, Harsh and
      Geva, Mor and
      Goldberg, Yoav and
      Roth, Dan and
      Khot, Tushar and
      Sabharwal, Ashish and
      Tsarfaty, Reut",
    journal = "Transactions of the Association for Computational Linguistics",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    year = "2025",
}