@hommayushi3 hommayushi3 commented Apr 9, 2025

This pull request includes several changes to the train function and related methods in the oumi module to support additional model and trainer keyword arguments. The motivation was to allow passing non-serializable keyword arguments into the trainer class (and the model class). The most important changes are as follows:

Enhancements to train function:

  • Modified the train function in src/oumi/__init__.py to accept additional_model_kwargs and additional_trainer_kwargs parameters and pass them to the oumi.train.train function.

Enhancements to train function in src/oumi/train.py:

  • Updated the _create_optional_training_kwargs function to accept an additional_trainer_kwargs parameter and include it in the returned dictionary.
  • Modified the train function to accept additional_model_kwargs and additional_trainer_kwargs parameters and use them when building the model and creating optional training kwargs.
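
As a rough illustration of the change described in the list above, the sketch below shows one way caller-supplied trainer kwargs could be merged into the dictionary of optional training kwargs. It is not the code from this PR: the function name `create_optional_training_kwargs_sketch`, the `metrics_function` parameter, and the merge order (caller entries applied last) are all assumptions made for the example.

```python
from typing import Any, Callable, Optional


def create_optional_training_kwargs_sketch(
    metrics_function: Optional[Callable[..., Any]] = None,
    additional_trainer_kwargs: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Hypothetical stand-in for a helper that collects optional trainer kwargs."""
    kwargs: dict[str, Any] = {}
    if metrics_function is not None:
        kwargs["compute_metrics"] = metrics_function
    # Assumption: caller-supplied entries are merged last, so they win on
    # key collisions. The values are passed through untouched, which is what
    # allows non-serializable objects such as functions.
    kwargs.update(additional_trainer_kwargs or {})
    return kwargs
```

Because the values are never serialized, a plain Python callable can travel through this path, which a YAML-based config could not express.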

Related issues

Fixes #1623 by allowing users to pass a preprocess_logits_for_metrics function through additional_trainer_kwargs in the train function.
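
For context, a minimal sketch of what such a function might look like is shown here. It assumes the `(logits, labels)` callback signature used by the Hugging Face `Trainer`, and it reduces the logits to predicted token ids so that the metrics step does not have to hold full vocabulary-sized tensors. The commented-out call at the end is hypothetical usage: the keyword name comes from this PR's description, while `config` and its position in the call are assumptions.

```python
def preprocess_logits_for_metrics(logits, labels):
    """Reduce raw logits to predicted token ids before metrics are computed."""
    # Some models return a tuple whose first element holds the logits.
    if isinstance(logits, tuple):
        logits = logits[0]
    # Argmax over the last (vocabulary) axis. Passing the axis positionally
    # works for both torch tensors and NumPy arrays.
    return logits.argmax(-1)


# Hypothetical usage, assuming a training config object named `config`:
# import oumi
# oumi.train(
#     config,
#     additional_trainer_kwargs={
#         "preprocess_logits_for_metrics": preprocess_logits_for_metrics,
#     },
# )
```

The `labels` argument is unused in this sketch but is kept so the function matches the expected callback signature.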

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the contributor guideline Pull Request guidelines?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

@wizeng23 wizeng23 self-requested a review April 9, 2025 19:52
@wizeng23 wizeng23 left a comment

Overall, this PR looks good to me. We aren't using the kwargs param of the train() function currently in our core codebase, so I don't think this should affect existing code. Let me loop in a second opinion for review as well.

@wizeng23 wizeng23 requested a review from oelachqar April 9, 2025 20:02
wizeng23 commented Apr 9, 2025

GPU test failure can be ignored; flaky LM Harness test

@oelachqar oelachqar left a comment

LGTM - Thank you for the contribution!

@wizeng23 wizeng23 left a comment

Thank you!

@wizeng23 wizeng23 merged commit 124b2fd into oumi-ai:main Apr 9, 2025
2 checks passed
Development

Successfully merging this pull request may close these issues.

[Feature] Enable custom metrics that require preprocess_logits_for_metrics in SFTTrainer