TST Fix doctest due to GradientBoostingClassifier difference with scipy 1.15 #30583

Merged: 3 commits merged into scikit-learn:main from lesteve:fix-doctest on Jan 6, 2025

Conversation

@lesteve (Member) commented on Jan 4, 2025

Follow up of #30495.

Fix failure seen in https://dev.azure.com/scikit-learn/scikit-learn/_build/results?buildId=73289&view=results.

Not sure why an issue was not opened automatically, to be honest, but 🤷. Edit: the automated issue is only for test failures; rst file doctests are run separately ...

Probably a change in numpy-dev where the result is:

0.45499999999999996

whereas it was 0.46... before

Another case where #30496 would be useful 😉

github-actions bot commented on Jan 4, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: c043775. Link to the linter CI: here

@lesteve added the "Quick Review" label (For PRs that are quick to review) on Jan 4, 2025
@virchan (Member) left a comment


LGTM! Thanks for fixing that @lesteve!

@virchan added the "Waiting for Second Reviewer" label (First reviewer is done, need a second one!) on Jan 4, 2025
@lesteve changed the title from "TST Fix doctest due to floating point difference in numpy dev" to "TST Fix doctest due to difference in scipy 1.15" on Jan 6, 2025
@lesteve (Member, Author) commented on Jan 6, 2025

In the end I used HistGradientBoostingClassifier, which seems more stable. Also, I think we should slightly nudge people towards HistGradientBoostingClassifier in our doc.
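For illustration, here is roughly the kind of snippet this moves towards (not the exact doctest from the rst file, just a sketch with a made-up dataset):

# Not the doctest touched by this PR, only an illustration: the same kind of
# classification snippet written with HistGradientBoostingClassifier, which
# seems less sensitive to the scipy 1.14 -> 1.15 difference discussed here.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = HistGradientBoostingClassifier(random_state=0).fit(X_train, y_train)
print(clf.score(X_test, y_test))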

For the record, I looked a bit more at this because the difference seemed sizeable enough: 0.4549 (third digit is 4) vs 0.4599 (third digit is 9). This actually seems related to scipy 1.15, and only with the default loss (log_loss), not exponential. My wild guess is a small change in scipy.special.xlogy. The scipy 1.15 changelog does mention a few accuracy improvements in scipy.special (although not explicitly xlogy).
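In case anyone wants to check the xlogy guess (again, it is only a guess), here is a minimal sketch: run it once with scipy 1.14.1 and once with scipy 1.15.0 and diff the printed values.

# Only a sanity check for the wild guess above, not a confirmed root cause:
# print xlogy with full precision so the output can be diffed across scipy
# versions (run this once in each environment).
import numpy as np
import scipy
from scipy.special import xlogy

print(scipy.__version__)
p = np.linspace(1e-6, 1 - 1e-6, 11)
for value in xlogy(p, p) + xlogy(1 - p, 1 - p):
    print(f"{value:.20g}")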

It seems like there is a small floating point difference in the gradient-boosting loss that can somehow turn into a sizable score difference when you are unlucky (0.62 vs 0.68 for example in the snippet below). The score difference starts happening at n_estimators=96 (at least on my machine). In this snippet there is no relationship between X and y, which is probably why such a small floating point difference in the loss can turn into a sizeable score difference.

I don't think this is worth bothering too much about. Here is a quick and dirty snippet I used to debug the difference, in case someone wants to have a closer look.

# %%
import joblib
import numpy as np

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

n_samples, n_features, n_classes = 200, 100, 2
rng = np.random.RandomState(42)
X = rng.standard_normal((n_samples, n_features))
y = rng.choice(n_classes, n_samples)

# Somehow the score differs starting at n_estimators=96, but the trees differ
# starting at n_estimators=2 (the first tree is identical, the second one is not).
# HistGradientBoostingClassifier does not show the same behaviour.
# This only happens with the default log_loss (not exponential), so maybe a
# scipy xlogy change.
clf = GradientBoostingClassifier(random_state=1, n_estimators=100, verbose=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))

# %%
from pprint import pprint

pprint([joblib.hash(est) for est in clf.estimators_][:5])

# %%
from sklearn.tree import export_text

second_tree = clf.estimators_[1][0]
print(joblib.hash(second_tree.tree_))
print(export_text(second_tree, decimals=10, max_depth=100))
print(joblib.hash(second_tree.tree_.impurity))

Some key findings:

  • the first tree is identical for scipy 1.14.1 and scipy 1.15.0
  • the second tree is not identical, but it only differs w.r.t. impurity; all the rest seems exactly the same
  • the loss reported by the GradientBoosting verbose output is slightly different (16th decimal)
scipy 1.14.1 output (score=0.62)
      Iter       Train Loss   Remaining Time 
         1 1.3031346873888148            0.34s
         2 1.2332030230874167            0.33s
         3 1.1717203225643884            0.33s
         4 1.1174320668883959            0.32s
         5 1.0548084758306693            0.31s
         6 0.9993879773505959            0.31s
         7 0.9580575095841599            0.30s
         8 0.9222968536421369            0.30s
         9 0.8858591714448600            0.30s
        10 0.8596018959266511            0.29s
        11 0.8234610031593187            0.29s
        12 0.7947498673644795            0.29s
        13 0.7653627222664898            0.29s
        14 0.7450631249903237            0.28s
        15 0.7162280716859813            0.28s
        16 0.6982154171604616            0.28s
        17 0.6705623402750668            0.27s
        18 0.6456791153472771            0.27s
        19 0.6280360014972848            0.27s
        20 0.6131350294776663            0.26s
        21 0.5992676733431259            0.26s
        22 0.5864773341011283            0.26s
        23 0.5729445833584860            0.25s
        24 0.5599316328546042            0.25s
        25 0.5453192111906160            0.25s
        26 0.5265456014234711            0.24s
        27 0.5149522069244638            0.24s
        28 0.4974834548586201            0.24s
        29 0.4780747556443493            0.23s
        30 0.4542496678981596            0.23s
        31 0.4409272164918828            0.23s
        32 0.4304938068980165            0.22s
        33 0.4138141177768845            0.22s
        34 0.4041484331213010            0.22s
        35 0.3959664034818218            0.21s
        36 0.3824560174382272            0.21s
        37 0.3716273989909005            0.21s
        38 0.3537127789444641            0.20s
        39 0.3455604430843203            0.20s
        40 0.3279765932771198            0.20s
        41 0.3172722762037838            0.19s
        42 0.3029701106285911            0.19s
        43 0.2922062908323000            0.19s
        44 0.2818142701171290            0.18s
        45 0.2759644986996963            0.18s
        46 0.2696230554982443            0.18s
        47 0.2632473125959979            0.17s
        48 0.2591024139756998            0.17s
        49 0.2528705435709728            0.17s
        50 0.2490297967420408            0.16s
        51 0.2436309333997958            0.16s
        52 0.2349376221577026            0.16s
        53 0.2273574589035918            0.15s
        54 0.2187492682653094            0.15s
        55 0.2143729661040189            0.15s
        56 0.2086445927810001            0.14s
        57 0.2041758258140417            0.14s
        58 0.1954509924207041            0.14s
        59 0.1904469561666429            0.13s
        60 0.1843991363310482            0.13s
        61 0.1799388041164273            0.13s
        62 0.1768572810284569            0.12s
        63 0.1737661078149119            0.12s
        64 0.1683929049157707            0.12s
        65 0.1627210531343629            0.11s
        66 0.1586436192699800            0.11s
        67 0.1539065254162819            0.11s
        68 0.1472677467883127            0.10s
        69 0.1412038272392375            0.10s
        70 0.1387907124065169            0.10s
        71 0.1345233348218365            0.09s
        72 0.1317032579171560            0.09s
        73 0.1298088310479556            0.09s
        74 0.1270377115423700            0.09s
        75 0.1234247271501200            0.08s
        76 0.1210321340293562            0.08s
        77 0.1174391792576234            0.08s
        78 0.1154998417708764            0.07s
        79 0.1134477449894605            0.07s
        80 0.1106352683849604            0.07s
        81 0.1068774561733941            0.06s
        82 0.1028151666176506            0.06s
        83 0.0990027619786546            0.06s
        84 0.0974211154325865            0.05s
        85 0.0955512284712263            0.05s
        86 0.0936564891257705            0.05s
        87 0.0911606526751379            0.04s
        88 0.0882395993745493            0.04s
        89 0.0857333070910218            0.04s
        90 0.0838003887236491            0.03s
        91 0.0809730187902697            0.03s
        92 0.0788725657796014            0.03s
        93 0.0767814499378289            0.02s
        94 0.0752604578961001            0.02s
        95 0.0737654081611234            0.02s
        96 0.0728320763418284            0.01s
        97 0.0712910470801800            0.01s
        98 0.0698062904968691            0.01s
        99 0.0684395024180467            0.00s
       100 0.0674112369464306            0.00s
0.62
['a385acb3f9390049de75e71077914420',
 '8cb5fa5b081d2659e7c14cc3552bf01c',
 '4d4217fbbaaa5c0e59f5d608d13b26ae',
 '50cee025c1aa47c179e305c3b4161898',
 '73bfc1da8bf179517244b8e49832c7fe']
d56b9e0586a4c5d348a53fd3f37fb6d8
|--- feature_49 <= -0.7416785955
|   |--- feature_62 <= -1.1938117146
|   |   |--- feature_74 <= -0.7086514533
|   |   |   |--- value: [1.7254004784]
|   |   |--- feature_74 >  -0.7086514533
|   |   |   |--- value: [-1.9224202823]
|   |--- feature_62 >  -1.1938117146
|   |   |--- feature_91 <= 1.3751169443
|   |   |   |--- value: [1.7254004784]
|   |   |--- feature_91 >  1.3751169443
|   |   |   |--- value: [-1.9224202823]
|--- feature_49 >  -0.7416785955
|   |--- feature_82 <= -0.3977906704
|   |   |--- feature_90 <= -0.2772663385
|   |   |   |--- value: [0.2673611228]
|   |   |--- feature_90 >  -0.2772663385
|   |   |   |--- value: [-1.6460506660]
|   |--- feature_82 >  -0.3977906704
|   |   |--- feature_98 <= -0.9155445397
|   |   |   |--- value: [-1.4461533807]
|   |   |--- feature_98 >  -0.9155445397
|   |   |   |--- value: [0.4797226668]

d7a3ac02bfb1ab2ba976e0ee57880771
scipy 1.15.0 output (score=0.68)
      Iter       Train Loss   Remaining Time 
         1 1.3031346873888148            0.34s
         2 1.2332030230874169            0.33s
         3 1.1717203225643884            0.32s
         4 1.1174320668883959            0.32s
         5 1.0548084758306693            0.31s
         6 0.9993879773505959            0.31s
         7 0.9580575095841599            0.30s
         8 0.9222968536421369            0.30s
         9 0.8858591714448600            0.30s
        10 0.8596018959266511            0.29s
        11 0.8234610031593187            0.29s
        12 0.7947498673644795            0.29s
        13 0.7653627222664898            0.28s
        14 0.7450631249903237            0.28s
        15 0.7162280716859813            0.28s
        16 0.6982154171604616            0.27s
        17 0.6705623402750668            0.27s
        18 0.6456791153472771            0.27s
        19 0.6280360014972848            0.26s
        20 0.6131350294776663            0.26s
        21 0.5992676733431259            0.26s
        22 0.5864773341011283            0.26s
        23 0.5729445833584860            0.25s
        24 0.5599316328546042            0.25s
        25 0.5453192111906160            0.25s
        26 0.5265456014234711            0.24s
        27 0.5149522069244639            0.24s
        28 0.4974834548586201            0.24s
        29 0.4780747556443493            0.23s
        30 0.4542496678981596            0.23s
        31 0.4409272164918828            0.23s
        32 0.4304938068980165            0.22s
        33 0.4138141177768845            0.22s
        34 0.4041484331213010            0.22s
        35 0.3959664034818218            0.21s
        36 0.3824560174382272            0.21s
        37 0.3716273989909005            0.21s
        38 0.3537127789444641            0.20s
        39 0.3455604430843203            0.20s
        40 0.3279765932771197            0.20s
        41 0.3172722762037838            0.19s
        42 0.3029701106285911            0.19s
        43 0.2922062908323000            0.19s
        44 0.2818142701171290            0.18s
        45 0.2759644986996963            0.18s
        46 0.2696230554982443            0.18s
        47 0.2632473125959979            0.17s
        48 0.2591024139756998            0.17s
        49 0.2528705435709728            0.17s
        50 0.2490297967420408            0.16s
        51 0.2436309333997958            0.16s
        52 0.2349376221577026            0.16s
        53 0.2273574589035917            0.15s
        54 0.2187492682653094            0.15s
        55 0.2143729661040189            0.15s
        56 0.2086445927810001            0.14s
        57 0.2041758258140417            0.14s
        58 0.1954509924207041            0.14s
        59 0.1904469561666429            0.13s
        60 0.1843991363310482            0.13s
        61 0.1799388041164273            0.13s
        62 0.1768572810284570            0.12s
        63 0.1737661078149119            0.12s
        64 0.1683929049157708            0.12s
        65 0.1627210531343629            0.11s
        66 0.1586436192699800            0.11s
        67 0.1539065254162819            0.11s
        68 0.1472677467883127            0.10s
        69 0.1412038272392375            0.10s
        70 0.1387907124065169            0.10s
        71 0.1345233348218365            0.09s
        72 0.1317032579171560            0.09s
        73 0.1298088310479556            0.09s
        74 0.1270377115423700            0.09s
        75 0.1234247271501200            0.08s
        76 0.1210321340293562            0.08s
        77 0.1174391792576234            0.08s
        78 0.1154998417708764            0.07s
        79 0.1134477449894605            0.07s
        80 0.1106352683849605            0.07s
        81 0.1068774561733941            0.06s
        82 0.1028151666176506            0.06s
        83 0.0990027619786546            0.06s
        84 0.0974211154325865            0.05s
        85 0.0955512284712264            0.05s
        86 0.0936564891257705            0.05s
        87 0.0911606526751379            0.04s
        88 0.0882395993745493            0.04s
        89 0.0857333070910218            0.04s
        90 0.0838003887236491            0.03s
        91 0.0809730187902697            0.03s
        92 0.0788725657796014            0.03s
        93 0.0767814499378289            0.02s
        94 0.0752604578961001            0.02s
        95 0.0737654081611233            0.02s
        96 0.0728320763418284            0.01s
        97 0.0712910470801800            0.01s
        98 0.0698062904968692            0.01s
        99 0.0684395024180466            0.00s
       100 0.0674112369464307            0.00s
0.68
['a385acb3f9390049de75e71077914420',
 'dcb4ef398a4ee30046e4ff6cb1cbaf83',
 '3447c6b87baddad6c584b5dfd226937c',
 '529d320b847ad1b468f64e4d424551e2',
 'cdc7564d89cfc829773998771432f655']
a5b1fc22ea2650b79f6286b016c5e6b9
|--- feature_49 <= -0.7416785955
|   |--- feature_62 <= -1.1938117146
|   |   |--- feature_74 <= -0.7086514533
|   |   |   |--- value: [1.7254004784]
|   |   |--- feature_74 >  -0.7086514533
|   |   |   |--- value: [-1.9224202823]
|   |--- feature_62 >  -1.1938117146
|   |   |--- feature_91 <= 1.3751169443
|   |   |   |--- value: [1.7254004784]
|   |   |--- feature_91 >  1.3751169443
|   |   |   |--- value: [-1.9224202823]
|--- feature_49 >  -0.7416785955
|   |--- feature_82 <= -0.3977906704
|   |   |--- feature_90 <= -0.2772663385
|   |   |   |--- value: [0.2673611228]
|   |   |--- feature_90 >  -0.2772663385
|   |   |   |--- value: [-1.6460506660]
|   |--- feature_82 >  -0.3977906704
|   |   |--- feature_98 <= -0.9155445397
|   |   |   |--- value: [-1.4461533807]
|   |   |--- feature_98 >  -0.9155445397
|   |   |   |--- value: [0.4797226668]

11a0d7df094a01cd1d23bc34a5fc00bc
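In case someone wants to pinpoint exactly where the impurities diverge, here is a rough sketch (the file names are hypothetical: dump clf with joblib.dump in each scipy environment first):

# Rough sketch with hypothetical file names: dump the fitted clf from the
# snippet above with joblib.dump in each environment, then load both dumps
# here and locate the first node of the second tree whose impurity differs.
import joblib
import numpy as np

clf_old = joblib.load("clf_scipy_1_14_1.joblib")  # hypothetical dump (scipy 1.14.1)
clf_new = joblib.load("clf_scipy_1_15_0.joblib")  # hypothetical dump (scipy 1.15.0)

imp_old = clf_old.estimators_[1][0].tree_.impurity
imp_new = clf_new.estimators_[1][0].tree_.impurity
diff = np.abs(imp_old - imp_new)
print("max impurity difference:", diff.max())
print("first differing node:", int(np.argmax(diff > 0)))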

I had a patch to increase the number of digits shown by the gradient-boosting verbose output:

diff --git a/sklearn/ensemble/_gb.py b/sklearn/ensemble/_gb.py
index fded8a5354..2daec967d8 100644
--- a/sklearn/ensemble/_gb.py
+++ b/sklearn/ensemble/_gb.py
@@ -298,7 +298,7 @@ class VerboseReporter:
         """
         # header fields and line format str
         header_fields = ["Iter", "Train Loss"]
-        verbose_fmt = ["{iter:>10d}", "{train_score:>16.4f}"]
+        verbose_fmt = ["{iter:>10d}", "{train_score:>16.16f}"]
         # do oob?
         if est.subsample < 1:
             header_fields.append("OOB Improve")

@lesteve changed the title from "TST Fix doctest due to difference in scipy 1.15" to "TST Fix doctest due to GradientBoostingClassifier difference with scipy 1.15" on Jan 6, 2025
@jeremiedbb (Member) commented

  • +1 to use HistGradientBoosting.

  • It's not very surprising that a small floating point difference at the beginning turns into a significant one at the end, especially when there are hard thresholding cuts in the middle, so I agree with you that we shouldn't be bothered too much by this change.
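A toy illustration of the second point (not the actual tree code, just the hard-thresholding effect, reusing the first split threshold of the second tree above for flavour):

# Toy illustration, not the actual estimator code: two inputs that differ by
# roughly one ulp land on different sides of a hard split, so the (made-up)
# downstream values differ by a lot even though the inputs barely do.
import numpy as np

threshold = -0.7416785955
x_a = np.nextafter(threshold, -np.inf)  # just below the split threshold
x_b = np.nextafter(threshold, np.inf)   # just above the split threshold

def branch_value(x):
    # hard thresholding: a ~2e-16 input difference flips the branch
    return 1.7 if x <= threshold else -1.9

print(abs(x_a - x_b))                        # ~2e-16 difference in the input
print(branch_value(x_a), branch_value(x_b))  # large difference in the output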

@jeremiedbb (Member) left a comment


LGTM

@jeremiedbb merged commit 2605370 into scikit-learn:main on Jan 6, 2025
34 checks passed
@lesteve deleted the fix-doctest branch on January 6, 2025 13:39
@lesteve (Member, Author) commented on Jan 6, 2025

> It's not very surprising that a small floating point difference at the beginning turns into a significant one at the end, especially when there are hard thresholding cuts in the middle, so I agree with you that we shouldn't be bothered too much by this change.

This is the kind of reassuring thing I wanted to hear; we should make a meme about it:

I am @jeremiedbb and I approve this floating point difference 😉

@lesteve (Member, Author) commented on Jan 6, 2025

There is still a bit of a mystery though, but I am willing to live with it:

  • why is HistGradientBoostingClassifier more stable?
  • why is there no difference until n_estimators=95, and then suddenly from n_estimators=96 onwards there is a difference? (OK, I tried only a few values, but it looks like this is the general behaviour ...)

I still think that part of the reason is that X and y are unrelated, which makes the problem kind of sensitive to numerical noise.

@ogrisel (Member) commented on Jan 6, 2025

> why is HistGradientBoostingClassifier more stable?

Maybe because the splits in hist gradient boosting do not involve the computation of a floating-point threshold value (but instead use discretized integer values that are computed a priori).
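Something like this toy sketch of the idea (not scikit-learn's actual binning code):

# Toy sketch, not the actual _BinMapper code: split decisions in hist gradient
# boosting are made on integer bin indices computed once from the raw values,
# so a one-ulp perturbation of the inputs (almost) never changes a bin, while
# a comparison against a computed float threshold can flip.
import numpy as np

rng = np.random.RandomState(0)
x = rng.standard_normal(1000)
# quantile-based bin edges, roughly in the spirit of max_bins=255
bin_edges = np.quantile(x, np.linspace(0, 1, 256)[1:-1])

x_perturbed = np.nextafter(x, np.inf)  # perturb every value by one ulp
bins = np.searchsorted(bin_edges, x)
bins_perturbed = np.searchsorted(bin_edges, x_perturbed)
print("values whose bin changed:", int((bins != bins_perturbed).sum()))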

> why is there no difference until n_estimators=95, and then suddenly from n_estimators=96 onwards there is a difference? (OK, I tried only a few values, but it looks like this is the general behaviour ...)

Maybe this is because gradient boosting sequentially fits models on the residuals of the predictions of the previous models, so later iterations are more likely to be impacted by small changes in the computation of the residuals.

> I still think that part of the reason is that X and y are unrelated, which makes the problem kind of sensitive to numerical noise.

It could be: there is little signal to extract from the training set, so the trees quickly face many potential splits that are all approximately equally (not) informative.

@lesteve (Member, Author) commented on Jan 6, 2025

Thanks for the ML insights @ogrisel! This makes sense; in particular, I had not thought of the binning as a stabilizing factor.

jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Jan 8, 2025
jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Jan 8, 2025
Labels: Quick Review, Waiting for Second Reviewer
4 participants