TST Fix doctest due to GradientBoostingClassifier difference with scipy 1.15 #30583

Merged: 3 commits merged into scikit-learn:main from lesteve:fix-doctest on Jan 6, 2025

Conversation

@lesteve (Member) commented on Jan 4, 2025

Follow up of #30495.

Fix failure seen in https://dev.azure.com/scikit-learn/scikit-learn/_build/results?buildId=73289&view=results.

Not sure why an issue was not opened automatically, to be honest, but 🤷. Edit: the automated issue is only for test failures; rst file doctests are run separately ...

Probably a change in numpy-dev where the result is:

0.45499999999999996

whereas it was 0.46... before

Another case where #30496 would be useful 😉

github-actions bot commented on Jan 4, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: c043775. Link to the linter CI: here

@lesteve added the "Quick Review" label (For PRs that are quick to review) on Jan 4, 2025
@virchan (Member) left a comment


LGTM! Thanks for fixing that @lesteve!

@virchan added the "Waiting for Second Reviewer" label (First reviewer is done, need a second one!) on Jan 4, 2025
@lesteve changed the title from "TST Fix doctest due to floating point difference in numpy dev" to "TST Fix doctest due to difference in scipy 1.15" on Jan 6, 2025
@lesteve (Member, Author) commented on Jan 6, 2025

In the end I used HistGradientBoostingClassifier, which seems more stable. Also, I think we should slightly nudge people towards HistGradientBoostingClassifier in our doc.
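For illustration, here is roughly the kind of snippet this moves towards (not the exact doctest from the rst file, just a sketch with a made-up dataset):

# Not the doctest touched by this PR, only an illustration: the same kind of
# classification snippet written with HistGradientBoostingClassifier, which
# seems less sensitive to the scipy 1.14 -> 1.15 difference discussed here.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = HistGradientBoostingClassifier(random_state=0).fit(X_train, y_train)
print(clf.score(X_test, y_test))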

For the record, I looked a bit more at this because the difference seemed sizeable enough: 0.4549 (third digit is 4) vs 0.4599 (third digit is 9). This actually seems related to scipy 1.15, and only with the default loss (log_loss), not exponential. My wild guess is a small change in scipy.special.xlogy. The scipy 1.15 changelog does mention a few accuracy improvements in scipy.special (although not explicitly xlogy).
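In case anyone wants to check the xlogy guess (again, it is only a guess), here is a minimal sketch: run it once with scipy 1.14.1 and once with scipy 1.15.0 and diff the printed values.

# Only a sanity check for the wild guess above, not a confirmed root cause:
# print xlogy with full precision so the output can be diffed across scipy
# versions (run this once in each environment).
import numpy as np
import scipy
from scipy.special import xlogy

print(scipy.__version__)
p = np.linspace(1e-6, 1 - 1e-6, 11)
for value in xlogy(p, p) + xlogy(1 - p, 1 - p):
    print(f"{value:.20g}")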

It seems like there is a small floating point difference in the gradient-boosting loss that can somehow turn into a sizable score difference when you are unlucky (0.62 vs 0.68 for example in the snippet below). The score difference starts happening at n_estimators=96 (at least on my machine). In this snippet there is no relationship between X and y, which is probably why such a small floating point difference in the loss can turn into a sizeable score difference.

I don't think this is worth bothering too much about. Here is a quick and dirty snippet I used to debug the difference, in case someone wants to have a closer look.

# %%
import joblib
import numpy as np

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

n_samples, n_features, n_classes = 200, 100, 2
rng = np.random.RandomState(42)
X = rng.standard_normal((n_samples, n_features))
y = rng.choice(n_classes, n_samples)

# Somehow the score differs starting at n_estimators=96, but the trees differ
# starting at n_estimators=2 (the first tree is identical, the second one is not).
# HistGradientBoostingClassifier does not show the same behaviour.
# This only happens with the default log_loss (not exponential), so maybe a
# scipy xlogy change.
clf = GradientBoostingClassifier(random_state=1, n_estimators=100, verbose=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))

# %%
from pprint import pprint

pprint([joblib.hash(est) for est in clf.estimators_][:5])

# %%
from sklearn.tree import export_text

second_tree = clf.estimators_[1][0]
print(joblib.hash(second_tree.tree_))
print(export_text(second_tree, decimals=10, max_depth=100))
print(joblib.hash(second_tree.tree_.impurity))

Some key findings:

  • the first tree is identical for scipy 1.14.1 and scipy 1.15.0
  • the second tree is not identical, but it only differs w.r.t. impurity; all the rest seems exactly the same
  • the loss reported by the GradientBoosting verbose output is slightly different (16th decimal)
scipy 1.14.1 output (score=0.62)
      Iter       Train Loss   Remaining Time 
         1 1.3031346873888148            0.34s
         2 1.2332030230874167            0.33s
         3 1.1717203225643884            0.33s
         4 1.1174320668883959            0.32s
         5 1.0548084758306693            0.31s
         6 0.9993879773505959            0.31s
         7 0.9580575095841599            0.30s
         8 0.9222968536421369            0.30s
         9 0.8858591714448600            0.30s
        10 0.8596018959266511            0.29s
        11 0.8234610031593187            0.29s
        12 0.7947498673644795            0.29s
        13 0.7653627222664898            0.29s
        14 0.7450631249903237            0.28s
        15 0.7162280716859813            0.28s
        16 0.6982154171604616            0.28s
        17 0.6705623402750668            0.27s
        18 0.6456791153472771            0.27s
        19 0.6280360014972848            0.27s
        20 0.6131350294776663            0.26s
        21 0.5992676733431259            0.26s
        22 0.5864773341011283            0.26s
        23 0.5729445833584860            0.25s
        24 0.5599316328546042            0.25s
        25 0.5453192111906160            0.25s
        26 0.5265456014234711            0.24s
        27 0.5149522069244638            0.24s
        28 0.4974834548586201            0.24s
        29 0.4780747556443493            0.23s
        30 0.4542496678981596            0.23s
        31 0.4409272164918828            0.23s
        32 0.4304938068980165            0.22s
        33 0.4138141177768845            0.22s
        34 0.4041484331213010            0.22s
        35 0.3959664034818218            0.21s
        36 0.3824560174382272            0.21s
        37 0.3716273989909005            0.21s
        38 0.3537127789444641            0.20s
        39 0.3455604430843203            0.20s
        40 0.3279765932771198            0.20s
        41 0.3172722762037838            0.19s
        42 0.3029701106285911            0.19s
        43 0.2922062908323000            0.19s
        44 0.2818142701171290            0.18s
        45 0.2759644986996963            0.18s
        46 0.2696230554982443            0.18s
        47 0.2632473125959979            0.17s
        48 0.2591024139756998            0.17s
        49 0.2528705435709728            0.17s
        50 0.2490297967420408            0.16s
        51 0.2436309333997958            0.16s
        52 0.2349376221577026            0.16s
        53 0.2273574589035918            0.15s
        54 0.2187492682653094            0.15s
        55 0.2143729661040189            0.15s
        56 0.2086445927810001            0.14s
        57 0.2041758258140417            0.14s
        58 0.1954509924207041            0.14s
        59 0.1904469561666429            0.13s
        60 0.1843991363310482            0.13s
        61 0.1799388041164273            0.13s
        62 0.1768572810284569            0.12s
        63 0.1737661078149119            0.12s
        64 0.1683929049157707            0.12s
        65 0.1627210531343629            0.11s
        66 0.1586436192699800            0.11s
        67 0.1539065254162819            0.11s
        68 0.1472677467883127            0.10s
        69 0.1412038272392375            0.10s
        70 0.1387907124065169            0.10s
        71 0.1345233348218365            0.09s
        72 0.1317032579171560            0.09s
        73 0.1298088310479556            0.09s
        74 0.1270377115423700            0.09s
        75 0.1234247271501200            0.08s
        76 0.1210321340293562            0.08s
        77 0.1174391792576234            0.08s
        78 0.1154998417708764            0.07s
        79 0.1134477449894605            0.07s
        80 0.1106352683849604            0.07s
        81 0.1068774561733941            0.06s
        82 0.1028151666176506            0.06s
        83 0.0990027619786546            0.06s
        84 0.0974211154325865            0.05s
        85 0.0955512284712263            0.05s
        86 0.0936564891257705            0.05s
        87 0.0911606526751379            0.04s
        88 0.0882395993745493            0.04s
        89 0.0857333070910218            0.04s
        90 0.0838003887236491            0.03s
        91 0.0809730187902697            0.03s
        92 0.0788725657796014            0.03s
        93 0.0767814499378289            0.02s
        94 0.0752604578961001            0.02s
        95 0.0737654081611234            0.02s
        96 0.0728320763418284            0.01s
        97 0.0712910470801800            0.01s
        98 0.0698062904968691            0.01s
        99 0.0684395024180467            0.00s
       100 0.0674112369464306            0.00s
0.62
['a385acb3f9390049de75e71077914420',
 '8cb5fa5b081d2659e7c14cc3552bf01c',
 '4d4217fbbaaa5c0e59f5d608d13b26ae',
 '50cee025c1aa47c179e305c3b4161898',
 '73bfc1da8bf179517244b8e49832c7fe']
d56b9e0586a4c5d348a53fd3f37fb6d8
|--- feature_49 <= -0.7416785955
|   |--- feature_62 <= -1.1938117146
|   |   |--- feature_74 <= -0.7086514533
|   |   |   |--- value: [1.7254004784]
|   |   |--- feature_74 >  -0.7086514533
|   |   |   |--- value: [-1.9224202823]
|   |--- feature_62 >  -1.1938117146
|   |   |--- feature_91 <= 1.3751169443
|   |   |   |--- value: [1.7254004784]
|   |   |--- feature_91 >  1.3751169443
|   |   |   |--- value: [-1.9224202823]
|--- feature_49 >  -0.7416785955
|   |--- feature_82 <= -0.3977906704
|   |   |--- feature_90 <= -0.2772663385
|   |   |   |--- value: [0.2673611228]
|   |   |--- feature_90 >  -0.2772663385
|   |   |   |--- value: [-1.6460506660]
|   |--- feature_82 >  -0.3977906704
|   |   |--- feature_98 <= -0.9155445397
|   |   |   |--- value: [-1.4461533807]
|   |   |--- feature_98 >  -0.9155445397
|   |   |   |--- value: [0.4797226668]

d7a3ac02bfb1ab2ba976e0ee57880771
scipy 1.15.0 output (score=0.68)
      Iter       Train Loss   Remaining Time 
         1 1.3031346873888148            0.34s
         2 1.2332030230874169            0.33s
         3 1.1717203225643884            0.32s
         4 1.1174320668883959            0.32s
         5 1.0548084758306693            0.31s
         6 0.9993879773505959            0.31s
         7 0.9580575095841599            0.30s
         8 0.9222968536421369            0.30s
         9 0.8858591714448600            0.30s
        10 0.8596018959266511            0.29s
        11 0.8234610031593187            0.29s
        12 0.7947498673644795            0.29s
        13 0.7653627222664898            0.28s
        14 0.7450631249903237            0.28s
        15 0.7162280716859813            0.28s
        16 0.6982154171604616            0.27s
        17 0.6705623402750668            0.27s
        18 0.6456791153472771            0.27s
        19 0.6280360014972848            0.26s
        20 0.6131350294776663            0.26s
        21 0.5992676733431259            0.26s
        22 0.5864773341011283            0.26s
        23 0.5729445833584860            0.25s
        24 0.5599316328546042            0.25s
        25 0.5453192111906160            0.25s
        26 0.5265456014234711            0.24s
        27 0.5149522069244639            0.24s
        28 0.4974834548586201            0.24s
        29 0.4780747556443493            0.23s
        30 0.4542496678981596            0.23s
        31 0.4409272164918828            0.23s
        32 0.4304938068980165            0.22s
        33 0.4138141177768845            0.22s
        34 0.4041484331213010            0.22s
        35 0.3959664034818218            0.21s
        36 0.3824560174382272            0.21s
        37 0.3716273989909005            0.21s
        38 0.3537127789444641            0.20s
        39 0.3455604430843203            0.20s
        40 0.3279765932771197            0.20s
        41 0.3172722762037838            0.19s
        42 0.3029701106285911            0.19s
        43 0.2922062908323000            0.19s
        44 0.2818142701171290            0.18s
        45 0.2759644986996963            0.18s
        46 0.2696230554982443            0.18s
        47 0.2632473125959979            0.17s
        48 0.2591024139756998            0.17s
        49 0.2528705435709728            0.17s
        50 0.2490297967420408            0.16s
        51 0.2436309333997958            0.16s
        52 0.2349376221577026            0.16s
        53 0.2273574589035917            0.15s
        54 0.2187492682653094            0.15s
        55 0.2143729661040189            0.15s
        56 0.2086445927810001            0.14s
        57 0.2041758258140417            0.14s
        58 0.1954509924207041            0.14s
        59 0.1904469561666429            0.13s
        60 0.1843991363310482            0.13s
        61 0.1799388041164273            0.13s
        62 0.1768572810284570            0.12s
        63 0.1737661078149119            0.12s
        64 0.1683929049157708            0.12s
        65 0.1627210531343629            0.11s
        66 0.1586436192699800            0.11s
        67 0.1539065254162819            0.11s
        68 0.1472677467883127            0.10s
        69 0.1412038272392375            0.10s
        70 0.1387907124065169            0.10s
        71 0.1345233348218365            0.09s
        72 0.1317032579171560            0.09s
        73 0.1298088310479556            0.09s
        74 0.1270377115423700            0.09s
        75 0.1234247271501200            0.08s
        76 0.1210321340293562            0.08s
        77 0.1174391792576234            0.08s
        78 0.1154998417708764            0.07s
        79 0.1134477449894605            0.07s
        80 0.1106352683849605            0.07s
        81 0.1068774561733941            0.06s
        82 0.1028151666176506            0.06s
        83 0.0990027619786546            0.06s
        84 0.0974211154325865            0.05s
        85 0.0955512284712264            0.05s
        86 0.0936564891257705            0.05s
        87 0.0911606526751379            0.04s
        88 0.0882395993745493            0.04s
        89 0.0857333070910218            0.04s
        90 0.0838003887236491            0.03s
        91 0.0809730187902697            0.03s
        92 0.0788725657796014            0.03s
        93 0.0767814499378289            0.02s
        94 0.0752604578961001            0.02s
        95 0.0737654081611233            0.02s
        96 0.0728320763418284            0.01s
        97 0.0712910470801800            0.01s
        98 0.0698062904968692            0.01s
        99 0.0684395024180466            0.00s
       100 0.0674112369464307            0.00s
0.68
['a385acb3f9390049de75e71077914420',
 'dcb4ef398a4ee30046e4ff6cb1cbaf83',
 '3447c6b87baddad6c584b5dfd226937c',
 '529d320b847ad1b468f64e4d424551e2',
 'cdc7564d89cfc829773998771432f655']
a5b1fc22ea2650b79f6286b016c5e6b9
|--- feature_49 <= -0.7416785955
|   |--- feature_62 <= -1.1938117146
|   |   |--- feature_74 <= -0.7086514533
|   |   |   |--- value: [1.7254004784]
|   |   |--- feature_74 >  -0.7086514533
|   |   |   |--- value: [-1.9224202823]
|   |--- feature_62 >  -1.1938117146
|   |   |--- feature_91 <= 1.3751169443
|   |   |   |--- value: [1.7254004784]
|   |   |--- feature_91 >  1.3751169443
|   |   |   |--- value: [-1.9224202823]
|--- feature_49 >  -0.7416785955
|   |--- feature_82 <= -0.3977906704
|   |   |--- feature_90 <= -0.2772663385
|   |   |   |--- value: [0.2673611228]
|   |   |--- feature_90 >  -0.2772663385
|   |   |   |--- value: [-1.6460506660]
|   |--- feature_82 >  -0.3977906704
|   |   |--- feature_98 <= -0.9155445397
|   |   |   |--- value: [-1.4461533807]
|   |   |--- feature_98 >  -0.9155445397
|   |   |   |--- value: [0.4797226668]

11a0d7df094a01cd1d23bc34a5fc00bc
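In case someone wants to pinpoint exactly where the impurities diverge, here is a rough sketch (the file names are hypothetical: dump clf with joblib.dump in each scipy environment first):

# Rough sketch with hypothetical file names: dump the fitted clf from the
# snippet above with joblib.dump in each environment, then load both dumps
# here and locate the first node of the second tree whose impurity differs.
import joblib
import numpy as np

clf_old = joblib.load("clf_scipy_1_14_1.joblib")  # hypothetical dump (scipy 1.14.1)
clf_new = joblib.load("clf_scipy_1_15_0.joblib")  # hypothetical dump (scipy 1.15.0)

imp_old = clf_old.estimators_[1][0].tree_.impurity
imp_new = clf_new.estimators_[1][0].tree_.impurity
diff = np.abs(imp_old - imp_new)
print("max impurity difference:", diff.max())
print("first differing node:", int(np.argmax(diff > 0)))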

I had a patch to increase the number of digits shown by the gradient-boosting verbose output:

diff --git a/sklearn/ensemble/_gb.py b/sklearn/ensemble/_gb.py
index fded8a5354..2daec967d8 100644
--- a/sklearn/ensemble/_gb.py
+++ b/sklearn/ensemble/_gb.py
@@ -298,7 +298,7 @@ class VerboseReporter:
         """
         # header fields and line format str
         header_fields = ["Iter", "Train Loss"]
-        verbose_fmt = ["{iter:>10d}", "{train_score:>16.4f}"]
+        verbose_fmt = ["{iter:>10d}", "{train_score:>16.16f}"]
         # do oob?
         if est.subsample < 1:
             header_fields.append("OOB Improve")

@lesteve changed the title from "TST Fix doctest due to difference in scipy 1.15" to "TST Fix doctest due to GradientBoostingClassifier difference with scipy 1.15" on Jan 6, 2025
@jeremiedbb (Member) commented

  • +1 to use HistGradientBoosting.

  • It's not very surprising that a small floating point difference at the beginning turns into a significant one at the end, especially when there are hard thresholding cuts in the middle, so I agree with you that we shouldn't be bothered too much by this change.
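A toy illustration of the second point (not the actual tree code, just the hard-thresholding effect, reusing the first split threshold of the second tree above for flavour):

# Toy illustration, not the actual estimator code: two inputs that differ by
# roughly one ulp land on different sides of a hard split, so the (made-up)
# downstream values differ by a lot even though the inputs barely do.
import numpy as np

threshold = -0.7416785955
x_a = np.nextafter(threshold, -np.inf)  # just below the split threshold
x_b = np.nextafter(threshold, np.inf)   # just above the split threshold

def branch_value(x):
    # hard thresholding: a ~2e-16 input difference flips the branch
    return 1.7 if x <= threshold else -1.9

print(abs(x_a - x_b))                        # ~2e-16 difference in the input
print(branch_value(x_a), branch_value(x_b))  # large difference in the output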

@jeremiedbb (Member) left a comment


LGTM

@jeremiedbb merged commit 2605370 into scikit-learn:main on Jan 6, 2025
34 checks passed
@lesteve deleted the fix-doctest branch on January 6, 2025 13:39
@lesteve (Member, Author) commented on Jan 6, 2025

> It's not very surprising that a small floating point difference at the beginning turns into a significant one at the end, especially when there are hard thresholding cuts in the middle, so I agree with you that we shouldn't be bothered too much by this change.

This is the kind of reassuring thing I wanted to hear; we should make a meme about it:

I am @jeremiedbb and I approve this floating point difference 😉

@lesteve (Member, Author) commented on Jan 6, 2025

There is still a bit of a mystery though, but I am willing to live with it:

  • why is HistGradientBoostingClassifier more stable?
  • why is there no difference until n_estimators=95, and then suddenly from n_estimators=96 onwards there is a difference? (OK, I tried only a few values, but it looks like this is the general behaviour ...)

I still think that part of the reason is that X and y are unrelated, which makes the problem kind of sensitive to numerical noise.

@ogrisel (Member) commented on Jan 6, 2025

> why is HistGradientBoostingClassifier more stable?

Maybe because the splits in hist gradient boosting do not involve the computation of a floating-point threshold value (but instead use discretized integer values that are computed a priori).
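Something like this toy sketch of the idea (not scikit-learn's actual binning code):

# Toy sketch, not the actual _BinMapper code: split decisions in hist gradient
# boosting are made on integer bin indices computed once from the raw values,
# so a one-ulp perturbation of the inputs (almost) never changes a bin, while
# a comparison against a computed float threshold can flip.
import numpy as np

rng = np.random.RandomState(0)
x = rng.standard_normal(1000)
# quantile-based bin edges, roughly in the spirit of max_bins=255
bin_edges = np.quantile(x, np.linspace(0, 1, 256)[1:-1])

x_perturbed = np.nextafter(x, np.inf)  # perturb every value by one ulp
bins = np.searchsorted(bin_edges, x)
bins_perturbed = np.searchsorted(bin_edges, x_perturbed)
print("values whose bin changed:", int((bins != bins_perturbed).sum()))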

> why is there no difference until n_estimators=95, and then suddenly from n_estimators=96 onwards there is a difference? (OK, I tried only a few values, but it looks like this is the general behaviour ...)

Maybe this is because gradient boosting sequentially fits models on the residuals of the predictions of the previous models, so later iterations are more likely to be impacted by small changes in the computation of the residuals.

> I still think that part of the reason is that X and y are unrelated, which makes the problem kind of sensitive to numerical noise.

It could be: there is little signal to extract from the training set, so the trees quickly face many potential splits that are all approximately equally (not) informative.

@lesteve (Member, Author) commented on Jan 6, 2025

Thanks for the ML insights @ogrisel! This makes sense; in particular, I had not thought of the binning as a stabilizing factor.

jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Jan 8, 2025
jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Jan 8, 2025
Labels: Quick Review, Waiting for Second Reviewer
4 participants