Your Bench
Organization Card
YourBench is an open-source framework for generating zero-shot benchmarks from your own documents. It helps you test language models on custom domains using automated pipelines for ingestion, summarization, and question generation.
- Build benchmarks from PDFs, HTML, or text files
- Generate both single-hop and multi-hop questions
- Evaluate top models and deploy leaderboards instantly
- Fully configurable via a single YAML file
Built with 🤗 by the OpenEvals team · GitHub
- yourbench/yourbench_reproduction_o4mini_biology • 1.83k • 10
- yourbench/yourbench_reproduction_o4mini_business • 829 • 8
- yourbench/yourbench_reproduction_o4mini_chemistry • 805 • 11
- yourbench/yourbench_reproduction_o4mini_computerscience • 1.81k • 4
spaces
7
- YourBench (Running on CPU Upgrade • 40): Generate custom evaluations from your data easily!
- Essential Web Medical (Sleeping): Select and annotate high-quality web documents
- View Essentialweb Cleaned (Sleeping)
- Reachy Trivia (Sleeping): Trivia Questions For The Reachy Mini and Reachy Team!
- Essential Web Annotation (Runtime error): Annotating Essential Web!
- Visualize Expert Level Filter (Running): Browse and inspect classified documents from a dataset
models
0
None public yet
datasets
84
- yourbench/childrens_books_questions • 62 • 13
- yourbench/mckinsey_great_trade_global_report • 511 • 52
- yourbench/aws_bedrock_documentation_demo • 1.18k • 15
- yourbench/yourbench-custom-prompts-example-gpt-4.1 • 55 • 24
- yourbench/yourbench-custom-prompts-example-oss-120b • 3 • 13
- yourbench/yourbench-custom-prompts-example • 52 • 41
- yourbench/yourbench-simple-example • 46 • 19
- yourbench/mckinsey_state_of_ai_doc_understanding • 29 • 136
- yourbench/highpass-medfilter-v2 • 465 • 20
- yourbench/highpassfilter-medical-documents-o4-mini • 465 • 10