Add .yaml config for `QuiltVQA` #919

nkaenzig · 2025-10-16T12:59:05Z

Closes #908

Building on #912 and #914, this PR adds a fully functional evaluation config for QuiltVQA. This is the first free-form visual question answering task in eva, and it uses a G-Eval LLM judge as its metric.
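For readers unfamiliar with G-Eval: as described in the original G-Eval paper, the judge LLM rates an answer on a fixed scale, and the final score is the probability-weighted average of the possible ratings rather than the single sampled rating. The sketch below illustrates only that aggregation step. It assumes the judge returns log-probabilities for candidate rating tokens; the function name `expected_score` and the 1-5 scale are hypothetical and are not taken from eva's `GEvalCorrectness` implementation, which may work differently.

```python
import math
from typing import Dict, Optional


def expected_score(
    token_logprobs: Dict[str, float],
    min_score: int = 1,
    max_score: int = 5,
) -> Optional[float]:
    """Probability-weighted rating over the tokens a judge could emit.

    Hypothetical helper for illustration; not part of eva's API.
    """
    # Map each valid rating token ("1".."5" by default) to its integer value.
    valid = {str(s): s for s in range(min_score, max_score + 1)}

    weights: Dict[int, float] = {}
    for token, logprob in token_logprobs.items():
        key = token.strip()
        if key in valid:
            score = valid[key]
            # Convert log-probability to probability and accumulate per rating.
            weights[score] = weights.get(score, 0.0) + math.exp(logprob)

    total = sum(weights.values())
    if total == 0.0:
        # No rating token found: the caller decides how to treat a missing score.
        return None

    # Renormalize over rating tokens only, then take the expectation.
    return sum(score * w for score, w in weights.items()) / total
```

Returning `None` when no rating token is present leaves the policy for missing scores (skip, count, or raise) to the caller.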

How to test?

export GEMINI_API_KEY=<your-secret>
DATA_ROOT=/path/to/download/dataset/to DOWNLOAD_DATA=true MODEL_NAME=google/gemini-2.5-flash-lite eva test --config configs/multimodal/pathology/online/free_form/quilt_vqa.yaml

MaxFeucht

LGTM, only minor comments

configs/multimodal/pathology/offline/free_form/quilt_vqa.yaml

src/eva/core/data/samplers/classification/balanced.py

src/eva/language/metrics/llm_judge/base.py

src/eva/language/metrics/llm_judge/g_eval/judge.py

nkaenzig added 4 commits October 16, 2025 09:38

add configs/multimodal/pathology/online/multiple_choice/quilt_vqa.yaml

096289c

add arrow files to LFS

af7da3e

fix unit tests

c8e7adc

add missing_limit to GEvalCorrectness

1af4179

nkaenzig self-assigned this Oct 16, 2025

nkaenzig marked this pull request as ready for review October 16, 2025 12:59

nkaenzig added 4 commits October 16, 2025 15:14

add google/gemini-2.5-flash-lite model to multimodal

7076128

add _raise_if_missing to GEvalCorrectness

52afe9f

fix unit test

c88dd21

update with main

a63df51

MaxFeucht approved these changes Oct 21, 2025

nkaenzig added 3 commits October 22, 2025 08:52

Merge remote-tracking branch 'origin/main' into 908-add-quilt-vqa-yam…

c38c5b6

…l-config

expose sample_ratio in yaml config

644579d

added comment to docstring when num_samples is None

06ec524

nkaenzig enabled auto-merge (squash) October 22, 2025 07:04

nkaenzig merged commit cd1c6d0 into main Oct 22, 2025
7 checks passed

nkaenzig deleted the 908-add-quilt-vqa-yaml-config branch October 22, 2025 07:13
