Conversation

@oelachqar
Contributor

Description

  • Add option to process images individually:
    • Some processors (e.g. Molmo) do not support processing a batch of images
    • This PR adds an option to process each image individually instead
  • Also add more docstrings / explanations of each option
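
As a rough illustration of the option described above, here is a minimal sketch of the two code paths. All names here (`single_image_processor`, `process_images`, `process_individually`) are hypothetical stand-ins for illustration and are not taken from this PR's diff; the fake processor only mimics a processor that rejects batched image input.

```python
from typing import Callable, List, Sequence


# Hypothetical stand-in for a processor that cannot handle a batch of images.
def single_image_processor(images: Sequence[str]) -> List[dict]:
    if len(images) != 1:
        raise ValueError("this processor accepts exactly one image per call")
    return [{"pixel_values": f"features({images[0]})"}]


def process_images(
    processor: Callable[[Sequence[str]], List[dict]],
    images: Sequence[str],
    process_individually: bool = False,
) -> List[dict]:
    """Call the processor once on the whole batch, or once per image."""
    if not process_individually:
        # Default path: hand the full batch to the processor in one call.
        return processor(images)
    # Fallback path: one call per image, results collected in input order.
    outputs: List[dict] = []
    for image in images:
        outputs.extend(processor([image]))
    return outputs


results = process_images(
    single_image_processor, ["a.png", "b.png"], process_individually=True
)
```

With `process_individually=False`, the same call would raise `ValueError` for this stand-in processor, which is the situation the option is meant to work around.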

Related issues

Fixes # (issue)

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the Pull Request guidelines in the contributor guide?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

Collaborator

@taenin taenin left a comment

Would love to see tests added, but LGTM as they don't exist yet.

max_var_dims = 2 if self._allow_multi_image_inputs else 1

# Pad and stack tensors
collated[key] = pad_to_max_dim_and_stack(
Contributor

@nikg4 nikg4 Jun 2, 2025

Note that this simple padding logic isn't compatible with all VLMs. Consider adding a note about it.

(Delegating everything to the processor was the original motivation for adding this collator.)
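
For context on what "simple padding" means here, the following is a minimal sketch of pad-then-stack for one variable dimension. `pad_and_stack_1d` is a hypothetical helper written for this illustration; it is not the `pad_to_max_dim_and_stack` function referenced in the diff, and its signature and behavior are assumptions.

```python
from typing import List, Sequence


def pad_and_stack_1d(
    rows: Sequence[Sequence[int]], pad_value: int = 0
) -> List[List[int]]:
    """Right-pad each row to the longest row's length, giving a rectangular batch."""
    max_len = max((len(row) for row in rows), default=0)
    return [list(row) + [pad_value] * (max_len - len(row)) for row in rows]


batch = pad_and_stack_1d([[1, 2, 3], [4]])
```

A scheme like this fixes the pad value and padding side for every key, so a model whose processor expects something different would presumably need its own handling, which is in line with the caveat raised above.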

@oelachqar oelachqar merged commit 828466a into main Jun 2, 2025
5 checks passed
@oelachqar oelachqar deleted the oelachqar/process_individually branch June 2, 2025 23:37