Add additional_model_kwargs and additional_trainer_kwargs to train function #1624

hommayushi3 · 2025-04-09T02:28:52Z

This pull request includes several changes to the train function and related methods in the oumi module to support additional model and trainer keyword arguments. The motivation was to allow passing non-serializable keyword arguments into the trainer class (and the model class). The most important changes are as follows:

Enhancements to `train` function:

Modified the train function in src/oumi/__init__.py to accept additional_model_kwargs and additional_trainer_kwargs parameters and pass them to the oumi.train.train function.

Enhancements to `train` function in `src/oumi/train.py`:

Updated the _create_optional_training_kwargs function to accept an additional_trainer_kwargs parameter and include it in the returned dictionary.
Modified the train function to accept additional_model_kwargs and additional_trainer_kwargs parameters and use them when building the model and creating optional training kwargs.
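For readers skimming the change, here is a minimal sketch of the forwarding pattern described above. It is not the code from this PR: `create_optional_training_kwargs`, `train_sketch`, `build_model`, and `build_trainer` are hypothetical stand-ins, and the choice that caller-supplied kwargs override the defaults on a key collision is an assumption.

```python
from typing import Any, Callable, Optional


def create_optional_training_kwargs(
    base_kwargs: dict[str, Any],
    additional_trainer_kwargs: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical stand-in for _create_optional_training_kwargs."""
    kwargs = dict(base_kwargs)  # copy so the caller's dict is not mutated
    # Assumption: caller-supplied kwargs win when a key appears in both.
    kwargs.update(additional_trainer_kwargs or {})
    return kwargs


def train_sketch(
    config: Any,
    build_model: Callable[..., Any],
    build_trainer: Callable[..., Any],
    additional_model_kwargs: Optional[dict[str, Any]] = None,
    additional_trainer_kwargs: Optional[dict[str, Any]] = None,
) -> Any:
    """Forward the extra kwargs to the (hypothetical) model and trainer builders."""
    model = build_model(config, **(additional_model_kwargs or {}))
    trainer_kwargs = create_optional_training_kwargs({}, additional_trainer_kwargs)
    return build_trainer(model, **trainer_kwargs)
```

Defaulting both parameters to `None` keeps existing callers unaffected, since an omitted argument expands to an empty dict.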

Related issues

Fixes #1623 by allowing users to pass a preprocess_logits_for_metrics function through additional_trainer_kwargs in the train function.
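As an illustration of the use case, a callable like the one below cannot be written into a serialized config, which is why it needs to be passed as a keyword argument. This is a sketch only: it assumes the underlying trainer follows the Hugging Face `Trainer` convention of calling `preprocess_logits_for_metrics(logits, labels)`, and the commented-out `oumi.train` call and its `config` argument are illustrative rather than copied from the PR.

```python
def preprocess_logits_for_metrics(logits, labels):
    """Reduce logits to predicted token ids before they are cached for metrics."""
    # Some models return a tuple whose first element holds the logits.
    if isinstance(logits, tuple):
        logits = logits[0]
    # Keeping only the argmax over the vocabulary axis shrinks what is accumulated.
    return logits.argmax(-1)


additional_trainer_kwargs = {
    "preprocess_logits_for_metrics": preprocess_logits_for_metrics,
}

# Hypothetical call, following the PR description:
# oumi.train(config, additional_trainer_kwargs=additional_trainer_kwargs)
```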

Before submitting

This PR only changes documentation. (You can ignore the following checks in that case)
Did you read the contributor guideline Pull Request guidelines?
Did you link the issue(s) related to this PR in the section above?
Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

wizeng23

Overall, this PR looks good to me. We aren't using the kwargs param of the train() function currently in our core codebase, so I don't think this should affect existing code. Let me loop in a second opinion for review as well.

wizeng23 · 2025-04-09T20:07:00Z

GPU test failure can be ignored; flaky LM Harness test

oelachqar

LGTM - Thank you for the contribution!

wizeng23

Thank you!

Commit 53b0794: Add additional_model_kwargs and additional_trainer_kwargs to train function

wizeng23 self-requested a review April 9, 2025 19:52

wizeng23 reviewed Apr 9, 2025

wizeng23 requested a review from oelachqar April 9, 2025 20:02

oelachqar approved these changes Apr 9, 2025

wizeng23 approved these changes Apr 9, 2025

wizeng23 merged commit 124b2fd into oumi-ai:main Apr 9, 2025
2 checks passed

penfever pushed a commit that referenced this pull request Aug 27, 2025

Commit 2efaa58: Add additional_model_kwargs and additional_trainer_kwargs to train function (#1624)
