[MRG] Change deprecation for min_impurity_split from removal to changing the default #12400
about the min_impurity_split
I don't think we can adopt the same approach as with tol, since this could
change the fit of every tree without the user being able to do anything
about it to make it consistent across versions. I fear we may need to go
back and do the deprecation properly.
The user will still not be able to reproduce the old behavior but I guess at least they'll have a year or so to fix it.
The message says to set it to -inf to avoid both warnings, but that means the code will break in 0.21 :/
Or do we add two more cycles, one changing the default to 0 and one
deprecating the parameter?
That is what I was thinking
I guess that's the cleanest way. @NicolasHug clear what to do? For now we basically change the warning saying that the parameter is deprecated, will change to
Is there a difference between 0 and -inf?
Ok got it.
Any negative value would raise an error anyway (except in my current changes, which I will revert)
+ docstring updates
sklearn/ensemble/forest.py
Outdated
    Threshold for early stopping in tree growth. A node will split
    if its impurity is above the threshold, otherwise it is a leaf.

    .. deprecated:: 0.19
       ``min_impurity_split`` has been deprecated in favor of
       ``min_impurity_decrease`` in 0.19 and will be removed in 0.21.
       Use ``min_impurity_decrease`` instead.
       ``min_impurity_decrease`` in 0.19 and will be removed in 0.25.
I would say this in the order
``min_impurity_split`` has been deprecated in favor of ``min_impurity_decrease`` in 0.19. The default value of ``min_impurity_split`` will change from 1e-7 to 0 in 0.23 and will be removed in 0.25. Use ``min_impurity_decrease`` instead.
sklearn/tree/tree.py
Outdated
    " will be removed in version 0.21. "
    warnings.warn("The min_impurity_split parameter is deprecated. "
    "Its default value will change from 1e-7 to 0 in "
    "version 0.23, and will be removed in 0.25. "
"it will be removed"?
Slight change suggested in the wording, otherwise looks good :)
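For reference, the warning under review, with the suggested "it will be removed" wording applied, might look roughly like the sketch below. This is not the merged code: `resolve_min_impurity_split` is a hypothetical helper, `DeprecationWarning` is an assumed category, and the negative-value check only mirrors the earlier remark that any negative value raises an error.

```python
import warnings


def resolve_min_impurity_split(min_impurity_split=None):
    """Hypothetical helper: pick the threshold and warn if it was set explicitly."""
    if min_impurity_split is not None:
        # Message follows the wording discussed in this thread.
        warnings.warn(
            "The min_impurity_split parameter is deprecated. "
            "Its default value will change from 1e-7 to 0 in "
            "version 0.23, and it will be removed in 0.25. "
            "Use min_impurity_decrease instead.",
            DeprecationWarning,
        )
    else:
        # Current default when the user does not set the parameter.
        min_impurity_split = 1e-7

    # Assumed validation: negative values (including -inf) are rejected.
    if min_impurity_split < 0.0:
        raise ValueError(
            "min_impurity_split must be greater than or equal to 0"
        )
    return min_impurity_split
```

Under this sketch, leaving the parameter unset stays silent and returns 1e-7, while passing any explicit value triggers the warning.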
thanks!
Can you please rename the PR to something more appropriate for the commit logs, @NicolasHug?
@jnothman better?
Yes, you can change that when doing squash and merge, but I had writer's block
Thanks @NicolasHug
… changing the default (scikit-learn#12400)" This reverts commit f3602ca.
Reference Issues/PRs
Addresses #12240 (comment)
What does this implement/fix? Explain your changes.
min_impurity_split is deprecated but still defaults to 1e-7 when not set, even if the user sets min_impurity_decrease.
This PR adds a ChangedBehaviorWarning indicating that results may change when min_impurity_split is actually removed.
Any other comments?
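To illustrate why results may change, here is a minimal sketch of the stopping rule as the docstring states it: a node splits only if its impurity is above the threshold. `is_leaf` and the impurity value are made up for illustration and are not taken from the tree implementation.

```python
def is_leaf(impurity, min_impurity_split):
    """Illustrative rule: a node is a leaf unless its impurity exceeds the threshold."""
    return impurity <= min_impurity_split


# Made-up impurity lying between the new default (0) and the old one (1e-7).
node_impurity = 5e-8

stops_with_old_default = is_leaf(node_impurity, 1e-7)  # node stays a leaf
stops_with_new_default = is_leaf(node_impurity, 0.0)   # node is split further
```

Under this rule, only nodes whose impurity falls between 0 and 1e-7 are treated differently by the two defaults, which is why a fitted tree can change even though the user never set the parameter.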