ScImage V2 Dataset

Introduction

We introduce ScImage V2, a benchmark dataset for evaluating scientific image generation across four major domains: biology, mathematics, computer science, and physics. ScImage V2 features an expanded terminology set, diverse mathematical functions, and a broad range of chart types. It includes over 2,000 high-quality, template-based text-to-image pairs designed to evaluate fine-grained scientific image generation capabilities. The dataset supports the development and assessment of more capable and reliable multimodal LLMs for scientific applications.

Directory Overview

Chart_Types: JSON files specifying all chart types used
Filled_Templates: CSVs of all 10 filled template batches (prompt + output)
Groupings: Grouped terms used to fill templates
Human_Evals: Human annotations for template evaluation, correction, and filtering
Plots: PDFs of all chart visuals used in the accompanying paper
Python_Code_1000: Python scripts for the first 1000 template examples
Python_Images_291: Rendered Python-generated images (291 samples)
TikZ_Code_1000: TikZ code for the first 1000 template examples
TikZ_Images_291: Rendered TikZ-generated images (291 samples)
ScImage_V1: Prompts and templates from ScImage V1 for baseline comparison
Scripts: Code for extracting chart types, generating plots, and filling templates
Templates: Human-curated templates (domain terms, math functions, charts)
ScImage_V2_Presentation.pptx: Slide deck presenting the ScImage V2 dataset
ScImage_V2_Paper.pdf: Paper describing the ScImage V2 dataset

Filled Templates Explanations

Understanding_Reasoning_Types: The types of reasoning the template involves: Attribute, Spatial, Numerical, or any combination of these.
Reasoning: Indicates whether the template requires reasoning to be completed correctly.
Difficulty: An integer from 1 (easy) to 3 (hard), reflecting the complexity of the template.
Template_Type: The category of the template. Options include domain_term (terms from DaTikZ V3), math_function, or chart.
Group: High-level domain category of the term, such as Math, Biology, Physics, or Computer Science.
Subgroup: A more specific classification within the selected group, such as Computational Geometry for Computer Science.
Template: The original template with placeholders, used for inserting selected terms.
Chosen_Terms: The specific terms selected by the LLM (GPT-4o) to fill into the template.
Filled_Template: The initial version of the template after term insertion, generated by GPT-4o.
Corrected_Template: A revised and improved version of the filled template, also generated by GPT-4o.
Evaluated_Template: Binary evaluation indicating whether the corrected template is acceptable (1 = good, 0 = still problematic).

Recommendation for Use

For downstream use, prefer the Corrected_Template over the Filled_Template. Filtering on Evaluated_Template = 1 keeps only the corrected templates that were judged acceptable, which makes it more likely that every prompt you work with is visualizable.
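
This recommendation can be sketched in Python using only the standard library. The sketch is illustrative and not taken from the Scripts folder: it assumes the batch files in Filled_Templates are UTF-8 CSVs with a header row using the column names documented above, and that the acceptance flag is stored as 1 (or 1.0). The function name load_usable_prompts is hypothetical.

```python
import csv
from pathlib import Path


def load_usable_prompts(folder="Filled_Templates"):
    """Collect Corrected_Template texts from rows where Evaluated_Template is 1."""
    prompts = []
    # Assumption: each batch is a separate .csv file directly inside the folder.
    for csv_path in sorted(Path(folder).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                flag = (row.get("Evaluated_Template") or "").strip()
                text = (row.get("Corrected_Template") or "").strip()
                # Keep only rows marked acceptable that have a non-empty prompt.
                if flag in {"1", "1.0"} and text:
                    prompts.append(text)
    return prompts
```

Running load_usable_prompts() from the repository root returns a list of prompt strings; rows without an acceptance flag or without a corrected template are skipped.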
