
MAINT Use check_scalar in BaseGradientBoosting #21632


Merged

Conversation

@genvalen (Contributor) commented Nov 11, 2021

Reference Issues/PRs

This PR requires #21990 to be merged first.
Addresses #20724 and #21927
#DataUmbrella

What does this implement/fix? Explain your changes.

Summary of changes to BaseGradientBoosting:

  • Add tests to ensure GradientBoostingClassifier and GradientBoostingRegressor raise proper errors when invalid arguments are passed in.
  • Use the helper function check_scalar from sklearn.utils to validate the scalar parameters (see the sketch below).
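
For reference, a minimal sketch of the check_scalar pattern this PR applies; the concrete value and bounds are illustrative, and the per-parameter ranges are listed in a comment further down:

```python
import numbers

from sklearn.utils import check_scalar

# learning_rate must be a real number in (0.0, inf): check_scalar raises
# TypeError on a wrong type and ValueError on an out-of-range value.
check_scalar(
    0.1,                           # stands in for self.learning_rate in fit()
    name="learning_rate",
    target_type=numbers.Real,
    min_val=0.0,
    include_boundaries="neither",  # exclusive bound, so the value must be > 0.0
)
```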

Test and validation progress:

In both estimators

  • learning_rate
  • n_estimators
  • min_samples_split
  • min_samples_leaf
  • min_weight_fraction_leaf
  • max_depth
  • min_impurity_decrease
  • subsample
  • max_features
  • ccp_alpha
  • verbose
  • max_leaf_nodes
  • warm_start
  • validation_fraction
  • n_iter_no_change
  • tol

In GradientBoostingRegressor

  • alpha

References

  1. check_scalar docs
  2. PR #20723

Any other comments?

For the unchecked tasks, validation comes from BaseDecisionTree; however, tests have been added for them here.
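
A hedged sketch of the style of test added (the parameter choices and expected error messages are illustrative, not the exact merged test code):

```python
import pytest

from sklearn.ensemble import GradientBoostingRegressor


@pytest.mark.parametrize(
    "params, err_type, err_msg",
    [
        # Out-of-range values: check_scalar raises ValueError.
        ({"learning_rate": -1.0}, ValueError, "learning_rate == -1.0, must be >"),
        ({"n_estimators": 0}, ValueError, "n_estimators == 0, must be >= 1"),
        # Wrong type: check_scalar raises TypeError.
        ({"n_estimators": 1.5}, TypeError, "n_estimators must be an instance of"),
    ],
)
def test_gradient_boosting_raises_on_invalid_params(params, err_type, err_msg):
    X, y = [[1.0], [2.0], [3.0], [4.0]], [0.5, 1.5, 2.5, 3.5]
    # Validation runs inside fit(), so the error surfaces there.
    with pytest.raises(err_type, match=err_msg):
        GradientBoostingRegressor(**params).fit(X, y)
```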

@genvalen (Contributor, Author) commented Dec 13, 2021

Notes for parameter ranges in Trees
(param: type, range; a validation sketch for the int-or-float case follows the list)
learning_rate: float, (0.0, inf)
n_estimators: int, [1, inf)
min_samples_split: int -> [2, inf), float -> (0, 1]
min_samples_leaf: int -> [1, inf), float -> (0, 1]
min_weight_fraction_leaf: float, [0, 0.5]
max_depth: int, if not None, then [1, inf)
min_impurity_decrease: float, [0, inf)
subsample: float, (0, 1]
alpha: float, (0, 1)
max_features: int -> [1, number of features], float -> (0, 1], or string (not checked with check_scalar)
ccp_alpha: float, [0, inf)
verbose: int, [0, inf), or np.bool_
max_leaf_nodes: int, if not None, then [2, inf)
warm_start: int, (-inf, inf), or np.bool_
validation_fraction: float, (0, 1)
n_iter_no_change: int, [1, inf)
tol: float, (0, inf)
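
A minimal sketch of how the int-or-float ranges above map onto check_scalar calls, using min_samples_split as the example; the dispatch helper is illustrative, not the exact merged code:

```python
import numbers

from sklearn.utils import check_scalar


def validate_min_samples_split(value):
    # Illustrative helper: the ranges come from the notes above.
    if isinstance(value, numbers.Integral):
        # int: at least 2 samples are needed to split an internal node.
        check_scalar(value, name="min_samples_split",
                     target_type=numbers.Integral, min_val=2)
    else:
        # float: interpreted as a fraction of the samples, in (0, 1].
        check_scalar(value, name="min_samples_split",
                     target_type=numbers.Real, min_val=0.0, max_val=1.0,
                     include_boundaries="right")
```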

@genvalen (Contributor, Author) commented Jan 8, 2022

Hi @glemaitre: just want to note that for the 7 remaining params in the PR task list (and also max_features), the validation comes from BaseDecisionTree.

I have added tests for these params in BaseGradientBoosting that fail right now, but they should pass once BaseDecisionTree #21990 gets merged in. (They are commented out for now.)

For these remaining params, please let me know if it would be helpful to include the validation explicitly within BaseGradientBoosting, too. Thank you!
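
As a side note, an alternative to commenting the pending tests out would be marking them as expected failures until the dependency lands; a sketch (the test name and reason string are hypothetical):

```python
import pytest


# Hypothetical marker: the test runs but is reported as an expected failure
# until scikit-learn#21990 (BaseDecisionTree validation) is merged.
@pytest.mark.xfail(reason="waiting on #21990", strict=False)
def test_pending_base_decision_tree_validation():
    ...
```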

@genvalen genvalen changed the title [WIP] MAINT Use check_scalar in BaseGradientBoosting [MRG] MAINT Use check_scalar in BaseGradientBoosting Jan 31, 2022
@genvalen genvalen marked this pull request as ready for review January 31, 2022 05:52
@genvalen (Contributor, Author) commented:

Hi @glemaitre and @ogrisel, please review this at your convenience. Thank you!

@glemaitre (Member) left a comment:

LGTM

Co-authored-by: Guillaume Lemaitre <[email protected]>
Comment on lines 375 to 377
max_val=self.n_features_in_,
include_boundaries="both",
)

Suggested change:
-    max_val=self.n_features_in_,
-    include_boundaries="both",
-)
+    include_boundaries="left",
+)
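
For context, include_boundaries controls which of min_val/max_val are inclusive; a sketch of the before/after semantics, assuming a min_val=1 set above the quoted lines (the excerpt does not show it):

```python
from sklearn.utils import check_scalar

# Before the suggestion: both boundaries inclusive, so the check enforces
# 1 <= max_features <= n_features_in_ (5 stands in for self.n_features_in_).
check_scalar(3, name="max_features", target_type=int,
             min_val=1, max_val=5, include_boundaries="both")

# After the suggestion: only the inclusive lower bound is enforced,
# i.e. max_features >= 1.
check_scalar(3, name="max_features", target_type=int,
             min_val=1, include_boundaries="left")
```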

@thomasjpfan thomasjpfan changed the title [MRG] MAINT Use check_scalar in BaseGradientBoosting MAINT Use check_scalar in BaseGradientBoosting Feb 7, 2022
@thomasjpfan (Member) left a comment:

LGTM

@thomasjpfan thomasjpfan merged commit b80138f into scikit-learn:main Feb 7, 2022
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Feb 9, 2022
thomasjpfan added a commit to thomasjpfan/scikit-learn that referenced this pull request Mar 1, 2022
@genvalen genvalen deleted the BaseGradientBoosting_add_check_scalar branch March 10, 2023 20:10