TST Fix doctest due to GradientBoostingClassifier difference with scipy 1.15 #30583
Conversation
LGTM! Thanks for fixing that @lesteve!
… seems more stable for some reason

In the end I ended up using …

For the record, I looked a bit more at this because the difference seemed sizeable enough. It seems like there is a small floating-point difference in the gradient-boosting loss that can somehow turn into a sizable difference in terms of score when you are unlucky (…). I don't think this is worth bothering too much about. Here is a quick and dirty snippet where I debugged the difference, in case someone wants to have a closer look:

```python
# %%
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import SelectKBest
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_validate
from sklearn.datasets import make_regression
import joblib
import numpy as np
n_samples, n_features, n_classes = 200, 100, 2
rng = np.random.RandomState(42)
X = rng.standard_normal((n_samples, n_features))
y = rng.choice(n_classes, n_samples)
# Somehow the score differs starting at n_estimators=96, but the trees differ
# starting at n_estimators=2 (the first tree is identical, the second one is
# not). HistGradientBoostingClassifier does not show the same behaviour.
# This only happens with the default loss="log_loss" (not "exponential"), so
# maybe the scipy xlogy change is involved.
clf = GradientBoostingClassifier(random_state=1, n_estimators=100, verbose=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
# %%
from pprint import pprint
pprint([joblib.hash(est) for est in clf.estimators_][:5])
# %%
from sklearn.tree import export_text
second_tree = clf.estimators_[1][0]
print(joblib.hash(second_tree.tree_))
print(export_text(second_tree, decimals=10, max_depth=100))
print(joblib.hash(second_tree.tree_.impurity))
```

Some key findings:
- scipy 1.14.1 output (score=0.62)
- scipy 1.15.0 output (score=0.68)
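
Since the comment in the snippet above suspects the scipy xlogy change, here is a toy sketch (my own illustration, not scikit-learn's actual loss code) showing the order of magnitude at which a log-loss built from scipy.special.xlogy can differ from an algebraically equivalent formulation:

```python
# Toy illustration only, not sklearn's implementation: compute a binomial
# log-loss once via scipy.special.xlogy and once with a naive formulation.
# The two agree up to the last couple of ULPs, which is the scale of the
# per-iteration discrepancy being discussed here.
import numpy as np
from scipy.special import xlogy

rng = np.random.RandomState(0)
y = rng.randint(0, 2, size=1000).astype(float)
p = np.clip(rng.uniform(size=1000), 1e-15, 1 - 1e-15)

loss_xlogy = -np.mean(xlogy(y, p) + xlogy(1 - y, 1 - p))
loss_naive = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
# The difference is typically 0 or on the order of 1e-16.
print(loss_xlogy, loss_naive, loss_xlogy - loss_naive)
```

The point is only that last-digit differences in the loss/gradient computation are plausible, and the boosting iterations can then amplify them into a visible score change.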
I had a patch to increase the number of digits shown by the gradient-boosting verbose output:

```diff
diff --git a/sklearn/ensemble/_gb.py b/sklearn/ensemble/_gb.py
index fded8a5354..2daec967d8 100644
--- a/sklearn/ensemble/_gb.py
+++ b/sklearn/ensemble/_gb.py
@@ -298,7 +298,7 @@ class VerboseReporter:
"""
# header fields and line format str
header_fields = ["Iter", "Train Loss"]
- verbose_fmt = ["{iter:>10d}", "{train_score:>16.4f}"]
+ verbose_fmt = ["{iter:>10d}", "{train_score:>16.16f}"]
# do oob?
if est.subsample < 1:
header_fields.append("OOB Improve")
```
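
As a side note, a minimal sketch (with a made-up loss value) of why the 4-digit default hides the discrepancy that the 16-digit patch above makes visible:

```python
import numpy as np

# Hypothetical train-loss values that differ by a single ULP.
a = 0.6931471805599453
b = np.nextafter(a, 1.0)

print("{:>16.4f} {:>16.4f}".format(a, b))    # indistinguishable at 4 digits
print("{:>16.16f} {:>16.16f}".format(a, b))  # the last digit differs
```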
LGTM
This is the kind of reassuring thing I wanted to hear, we should do a meme about it:
There is still a bit of a mystery though, but I am willing to live with it, like: …

I still think that part of the reason is that …
Maybe it is because the splits in hist gradient boosting do not involve the computation of a floating-point threshold value (but instead use discretized integer values that are computed a priori).

Maybe it is because gradient boosting sequentially fits models on the residuals of the predictions of the previous models, so later iterations are more susceptible to being impacted by small changes in the computation of the residuals.

It could also be that there is so little signal to extract from the training set that many potential tree splits are all approximately equally (not) informative.
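
To make the binning argument concrete, here is a toy sketch (my own illustration, not how HistGradientBoostingClassifier is actually implemented): a one-ULP perturbation of a feature can flip a sample across an exact floating-point split threshold, but it essentially never moves a sample into a different pre-computed bin.

```python
import numpy as np

rng = np.random.RandomState(0)
x = rng.uniform(size=1000)
x_shifted = np.nextafter(x, 1.0)  # perturb every value upwards by one ULP

# Exact floating-point threshold: the sample that sits exactly on the
# threshold changes side after the perturbation.
threshold = x[0]
print(np.sum(x <= threshold), np.sum(x_shifted <= threshold))

# Binned comparison (roughly the idea behind histogram-based splitting):
# with 255 quantile-based bin edges computed once on the original data,
# the one-ULP shift leaves (almost) every sample in the same bin.
edges = np.quantile(x, np.linspace(0.0, 1.0, 257)[1:-1])
bins = np.searchsorted(edges, x)
bins_shifted = np.searchsorted(edges, x_shifted)
print(np.sum(bins != bins_shifted))  # typically 0
```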
Thanks for the ML insights @ogrisel! This makes sense; in particular I did not think the binning could be a stabilizing factor.
Follow-up of #30495.
Fix failure seen in https://dev.azure.com/scikit-learn/scikit-learn/_build/results?buildId=73289&view=results.
Not sure why there was not an automatically opened issue, to be honest, but 🤷. Edit: the automated issue is only for test failures; rst file doctests are run separately ...
Probably a change with numpy-dev, where the result is now … whereas it was 0.46... before.

Another case where #30496 would be useful 😉
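
For completeness, a hypothetical illustration (not the actual doctest touched by this PR, and with a made-up value) of the ellipsis mechanism that rst doctests rely on to absorb this kind of small numerical drift:

```python
import doctest

def example():
    """
    An expected output that keeps fewer digits and ends with "..." still
    matches when the later digits change across dependency versions.

    >>> 0.4583981  # doctest: +ELLIPSIS
    0.4...
    """

doctest.run_docstring_examples(example, {}, verbose=True)
```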