[vision] Add option to process images individually #1706

oelachqar · 2025-06-02T21:51:30Z

Description

Add an option to process images individually:
- Some processors (e.g. Molmo) do not support processing a batch of images.
- This PR adds an option to process each image individually instead.

Also adds more docstrings and explanations for each option.
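
For readers unfamiliar with the distinction, here is a minimal sketch of batched versus per-image processing. It is not the code from this PR: `process_images`, the `process_individually` flag name, and the toy `single_image_only_processor` are all hypothetical, and the processor is assumed to be a callable that takes a list of images and returns a dict of per-image feature lists.

```python
from typing import Any, Callable, Sequence

# Assumed shape of a processor's output: feature name -> one entry per image.
Features = dict[str, list[Any]]


def process_images(
    images: Sequence[Any],
    processor: Callable[[list[Any]], Features],
    process_individually: bool = False,
) -> Features:
    """Run a processor on all images at once, or one image at a time."""
    if not process_individually:
        # Default path: hand the whole batch to the processor in one call.
        return processor(list(images))

    # Fallback path: call the processor once per image and merge the outputs.
    merged: Features = {}
    for image in images:
        single = processor([image])
        for key, values in single.items():
            merged.setdefault(key, []).extend(values)
    return merged


def single_image_only_processor(images: list[Any]) -> Features:
    """Toy stand-in for a processor that rejects batches of images."""
    if len(images) != 1:
        raise ValueError("this processor accepts exactly one image per call")
    return {"pixel_values": [f"features({images[0]})"]}
```

With the toy processor above, the batched path raises a `ValueError` for two images, while `process_individually=True` returns one merged feature dict covering both.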

Related issues

Fixes # (issue)

Before submitting

This PR only changes documentation. (You can ignore the following checks in that case)
Did you read the contributor guideline Pull Request guidelines?
Did you link the issue(s) related to this PR in the section above?
Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

src/oumi/core/collators/vision_language_sft_collator.py

taenin

Would love to see tests added, but LGTM since no tests exist for this yet.

nikg4 · 2025-06-02T22:34:19Z

src/oumi/core/collators/vision_language_sft_collator.py

+            max_var_dims = 2 if self._allow_multi_image_inputs else 1
+
+            # Pad and stack tensors
+            collated[key] = pad_to_max_dim_and_stack(


Note that this simple padding logic isn't compatible with all VLMs. Consider adding a note about it.

(Delegating everything to the processor was the original motivation for adding this collator.)
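
To make the padding concern concrete, here is a rough sketch of "pad to the largest size, then stack" on plain nested lists. It is not the `pad_to_max_dim_and_stack` helper from the diff: `pad_and_stack_sketch` and its parameters are hypothetical, it only handles rectangular 2-D inputs, and it assumes that only the leading `max_variable_dims` dimensions are allowed to differ between items.

```python
from typing import Any

Matrix = list[list[Any]]


def _pad_2d(matrix: Matrix, rows: int, cols: int, pad_value: Any) -> Matrix:
    """Pad a rectangular 2-D list on the right and bottom to (rows, cols)."""
    padded = [row + [pad_value] * (cols - len(row)) for row in matrix]
    padded += [[pad_value] * cols for _ in range(rows - len(matrix))]
    return padded


def pad_and_stack_sketch(
    items: list[Matrix], max_variable_dims: int = 1, pad_value: Any = 0
) -> list[Matrix]:
    """Pad every 2-D item to the largest shape and return them as one batch.

    Assumption: only the first `max_variable_dims` dimensions may differ
    across items; any other mismatch is treated as an error.
    """
    shapes = [(len(m), len(m[0]) if m else 0) for m in items]
    for dim in range(max_variable_dims, 2):
        if len({shape[dim] for shape in shapes}) > 1:
            raise ValueError(f"dimension {dim} must match across all items")
    rows = max(shape[0] for shape in shapes)
    cols = max(shape[1] for shape in shapes)
    return [_pad_2d(m, rows, cols, pad_value) for m in items]
```

Padding like this inserts filler values that the model may not know how to ignore, which is one plausible reason a generic scheme would not suit every processor's output layout.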

update

23ec974

oelachqar requested review from jgreer013, nikg4, taenin and wizeng23 June 2, 2025 21:51

wizeng23 approved these changes Jun 2, 2025

taenin reviewed Jun 2, 2025

taenin approved these changes Jun 2, 2025

oelachqar added 2 commits June 2, 2025 15:31

Merge branch 'main' into oelachqar/process_individually

81cb323

update

8652174

nikg4 reviewed Jun 2, 2025

nikg4 approved these changes Jun 2, 2025

oelachqar merged commit 828466a into main Jun 2, 2025
5 checks passed

oelachqar deleted the oelachqar/process_individually branch June 2, 2025 23:37

penfever pushed a commit that referenced this pull request Aug 27, 2025

[vision] Add option to process images individually (#1706)

dea3c8c
