DOC `GradientBoosting*` will not implement monotonic constraints, use `HistGradientBoosting*` instead #27516
Conversation
Before reviewing this PR, I would first like to fix the constraint in … In case we don't make it for the 1.4 release, we would need to revert more than a single commit, which would make it complex. But right after that, I will be reviewing this PR.
#27639 is merged, so we can resume this PR. As for the tests, we should have almost the same ones as in sklearn/ensemble/_hist_gradient_boosting/tests/test_monotonic_contraints.py.
@lorentzenchr Thanks for the feedback; I updated the tests. It seems that … Please let me know if I misunderstood what you meant.
AFAIU, line search must still be performed, with the addition that node values (after the line search) that trespass the monotonicity boundary must be set to the boundary value.
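To illustrate the clipping idea, a minimal sketch, assuming per-leaf lower/upper bounds are available as plain arrays (the function name and the arrays are hypothetical; the real scikit-learn tree structures do not expose the bounds this way):

```python
import numpy as np


def clip_leaf_values_to_bounds(leaf_values, lower_bounds, upper_bounds):
    """Clamp per-leaf values obtained after line search into the
    monotonicity interval [lower, upper] recorded for each leaf.

    All three arguments are hypothetical 1-d arrays indexed by leaf id.
    """
    return np.clip(leaf_values, lower_bounds, upper_bounds)


# Example: the line search pushed leaf 1 above its upper bound, so it is
# snapped back to the boundary value.
values = np.array([0.2, 1.5, -0.3])
lower = np.array([-1.0, -1.0, -1.0])
upper = np.array([1.0, 1.0, 1.0])
print(clip_leaf_values_to_bounds(values, lower, upper))  # [ 0.2  1.  -0.3]
```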
@lorentzenchr Sorry that I forgot about this PR.
I think I understand the theory, but I'm not sure I'm implementing it the right way 😢 The boundary values do not seem to be accessible from the tree instance at the line search step (i.e., …).
@Charlie-XIAO This feature is trickier than anticipated. Modifying …
Sorry for the late reply @lorentzenchr. I don't see an intuitive way to achieve Option 1, again because we cannot access the tree-building process in GBT (it calls …). Update: maybe something like in 3ac75ea?
@Charlie-XIAO Modifying the tree also does not work, because we only know the "line search" values after having fit the tree, and only those line search values are the ones that count (except for squared error). Could we use the constraints of the decision trees and then, during the line search, check that we don't violate them? If that doesn't work either, it looks bad for this feature. @adrinjalali @thomasjpfan @NicolasHug @ogrisel Do you see possible solutions?
Interesting PR. Taking a step back, I'm wondering: don't we basically want to make HGBT the "default" for gradient boosting? Wouldn't it make sense to only have this for HGBTs and not for GBs?
I'm +1 on having monotonic constraints only in HGBT and not the regular GB. |
In that case should we just maybe add a note in the docstring recommending HGBT for monotonic constraints? |
That makes sense to me. |
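For context, `HistGradientBoosting*` already exposes this via the `monotonic_cst` parameter; a minimal usage sketch (the data and the constraint vector are made up for illustration):

```python
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

rng = np.random.RandomState(0)
X = rng.uniform(size=(200, 2))
# Target increases with feature 0 and decreases with feature 1 (plus noise).
y = 3 * X[:, 0] - 2 * X[:, 1] + 0.1 * rng.normal(size=200)

# monotonic_cst: 1 = monotonically increasing, -1 = decreasing, 0 = unconstrained.
model = HistGradientBoostingRegressor(monotonic_cst=[1, -1])
model.fit(X, y)
```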
LGTM
nit, otherwise LGTM.
Fixed :)
Reference Issues/PRs

Closes #27305.

What does this implement/fix? Explain your changes.

This PR implements monotonicity constraints for `GradientBoostingClassifier` and `GradientBoostingRegressor`. This was dropped from #13649.

Any other comments?

For your reference: Greedy Function Approximation (Friedman). There were discussions around whether line search should be performed when using monotonic constraints, see #13649 (comment). I did not fully understand this, so it would be nice if someone could explain it in more detail. By the way, `test_monotonic_constraints_classifications` in `sklearn/tree/tests/test_monotonic_tree.py` would fail if line search is performed.

Speaking of tests, I'm also a bit confused about where they should be placed. It seems that we should have tests similar to (if not the same as) `sklearn/tree/tests/test_monotonic_tree.py`, so I currently only extended the parametrizations to include `GradientBoostingClassifier` and `GradientBoostingRegressor` (see the sketch below). Still, it's a bit strange to test one module under another. Please correct me if this is wrong.

@lorentzenchr Would you want to take a look? I'm not sure if this is what the target issue desired.
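As an illustration of the kind of parametrization extension described above, a sketch only; the list names and the test body are hypothetical and do not reproduce the actual tests in `sklearn/tree/tests/test_monotonic_tree.py`:

```python
import pytest

from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Hypothetical extension of the existing parametrization: the plain gradient
# boosting estimators are added next to the decision trees.
TREE_BASED_CLASSIFIERS = [DecisionTreeClassifier, GradientBoostingClassifier]
TREE_BASED_REGRESSORS = [DecisionTreeRegressor, GradientBoostingRegressor]


@pytest.mark.parametrize("TreeClassifier", TREE_BASED_CLASSIFIERS)
@pytest.mark.parametrize("TreeRegressor", TREE_BASED_REGRESSORS)
def test_parametrization_placeholder(TreeClassifier, TreeRegressor):
    # Placeholder body: the real tests fit each estimator with monotonic
    # constraints and assert that predictions are monotone in the
    # constrained features.
    assert hasattr(TreeClassifier(), "fit")
    assert hasattr(TreeRegressor(), "fit")
```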