FIX fix comparison between array-like parameters when detecting non-default params for HTML representation #31528

DeaMariaLeon · 2025-06-12T06:55:34Z

Reference Issues/PRs

Fixes #31525

What does this implement/fix? Explain your changes.

As @glemaitre commented, we were comparing 2 different-size arrays:
param_value != init_default_params[param_name]
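
For context, here is a minimal sketch of why a plain `!=` is fragile in this situation. The values are illustrative (a tuple default and user-supplied arrays) and are not copied from the issue; only NumPy calls are used.

```python
import numpy as np

default = (0.1, 1.0, 10.0)              # illustrative default, assumed for this sketch
same_size = np.array([0.1, 1.0, 5.0])   # same length as the default
other_size = np.array([0.1, 1.0])       # different length

# Same shape: `!=` is elementwise, so the result is an array, and using it
# in an `if` raises ValueError (the truth value of an array is ambiguous).
try:
    if same_size != default:
        pass
    ambiguous = False
except ValueError:
    ambiguous = True

# Different shapes: the outcome of `!=` depends on the NumPy version
# (older releases warn, recent ones raise), so it is only probed here.
try:
    _ = other_size != default
    mismatch_raised = False
except (ValueError, DeprecationWarning):
    mismatch_raised = True

# np.array_equal returns a plain bool in both situations.
print(ambiguous, np.array_equal(other_size, default))
```

Either way, the raw comparison cannot be used directly as an `if` condition, which is what the non-default detection needs.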

Any other comments?

😱

github-actions · 2025-06-12T06:56:30Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: c774f2c. Link to the linter CI: here

jeremiedbb · 2025-06-12T09:02:05Z

Instead of wrapping it in a try except, we could use np.array_equal:

if not np.array_equal(
    param_value, init_default_params[param_name]
) and not (
    is_scalar_nan(init_default_params[param_name])
    and is_scalar_nan(param_value)
):

which should work because np.array_equal treats its arguments as array-likes.
Ideally we would also set equal_nan=True, but that doesn't work because it relies on np.isnan, which is not as generic as our is_scalar_nan :(
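
To see how the suggested condition behaves, it can be wrapped in a small helper. This is a sketch, not the code merged in this PR: `differs_from_default` is a hypothetical name, and `_is_scalar_nan` is a simplified stand-in for scikit-learn's `is_scalar_nan`, so the example only needs NumPy.

```python
import math
import numbers

import numpy as np


def _is_scalar_nan(x):
    # Simplified stand-in (assumption) for sklearn's is_scalar_nan:
    # True only for real, non-integer scalars that are NaN.
    return (
        isinstance(x, numbers.Real)
        and not isinstance(x, numbers.Integral)
        and math.isnan(x)
    )


def differs_from_default(param_value, default):
    # Hypothetical helper mirroring the condition suggested above:
    # values differ unless they are array-equal or both scalar NaN.
    return not np.array_equal(param_value, default) and not (
        _is_scalar_nan(default) and _is_scalar_nan(param_value)
    )


print(differs_from_default(np.array([1, 2]), [1, 2]))     # same content
print(differs_from_default(np.array([1, 2, 3]), [1, 2]))  # different sizes
print(differs_from_default(float("nan"), np.nan))         # both NaN
print(differs_from_default("auto", "auto"))               # equal strings
```

Only the second call reports a difference. The separate NaN check is needed because `np.array_equal(nan, nan)` is False when `equal_nan` is left at its default.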

jeremiedbb · 2025-06-12T09:03:18Z

Please also add a non regression test and a changelog entry. You can reuse the snippet from the issue.

glemaitre · 2025-06-12T09:40:56Z

> Instead of wrapping it in a try except, we could use np.array_equal

We probably want to do that only if one of the two parameters is an array-like. Otherwise, we want to use the Python comparison for scalars.

> Please also add a non regression test and a changelog entry. You can reuse the snippet from the issue.

I'm wondering if we could strengthen the non-regression test by checking different combinations of parameter types (int, float, bool, array-like, ndarray, etc.).

DeaMariaLeon · 2025-06-12T16:44:32Z

Thank you both @jeremiedbb and @glemaitre

If a default parameter is [1, 2] but the user enters np.array([1, 2]), should it be shown as the same (as a default parameter)? Guillaume believes they should both be shown as default, because the algorithm behaves the same way. But he suggested I ask the question here as well.

jeremiedbb · 2025-06-12T20:53:23Z

> If a default parameter is [1, 2] but the user enters np.array([1, 2]), should it be shown as the same? (as a default parameter).

I'm okay with showing them as the same, because we don't have a way to differentiate for scalar values anyway (since we're not using sentinels for default values, and we're not going to start doing that).

> We probably want to do that only if one of the two parameters is an array-like

Why only for array-likes? It also works for scalar values, whatever kind of object they are.

glemaitre · 2025-06-13T09:39:47Z

> Why only for array-likes? It also works for scalar values, whatever kind of object they are.

Do you want to call np.array_equal on scalars as well? Since it is named array_equal, I would have just limited it to array-likes because it seems natural (I would be scared about the conversion from a scalar to a numpy array, but indeed there is no reason to be).

DeaMariaLeon · 2025-06-13T11:12:09Z

@glemaitre @jeremiedbb
Could you tell me if what I did is OK please?
Thanks in advance

sklearn/tests/test_base.py

doc/whats_new/upcoming_changes/sklearn.base/31528.fix.rst

jeremiedbb · 2025-06-13T14:52:10Z

> Do you want to call np.array_equal on scalars as well? Since it is named array_equal, I would have just limited it to array-likes because it seems natural (I would be scared about the conversion from a scalar to a numpy array, but indeed there is no reason to be).

Yes because it works (at least for our use cases as far as I can tell) with a single condition :)
But I'd understand if you find it too convoluted and prefer to be explicit.

DeaMariaLeon · 2025-06-13T15:01:43Z

I added a couple of cases: two arrays of different sizes, and one int vs. float. I also removed the f-string after the assert; I can put it back (but the message needs a bit of tweaking, I think).

DeaMariaLeon · 2025-06-13T15:23:49Z

Have you considered using hypothesis to test/find edge cases? I mean in general, in the past. I don't mean it should be used now. It's quite good to find them, as you might know.

https://hypothesis.readthedocs.io/en/latest/quickstart.html

glemaitre · 2025-06-13T15:41:23Z

> Have you considered using hypothesis to test/find edge cases? I mean in general, in the past. I don't mean it should be used now. It's quite good to find them, as you might know.

There is a discussion that has been started here: #13846

glemaitre

LGTM on my side.

glemaitre · 2025-06-13T15:48:01Z

I added the PR into the 1.7.1 milestone. Thanks @DeaMariaLeon

adrinjalali · 2025-06-18T14:37:35Z

We special-case BaseEstimator here; what happens to scorers, CV splitters, and all other non-scalar, non-NumPy objects?
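
As a rough illustration of the question, the sketch below shows what `np.array_equal` does with arbitrary Python objects that define no `__eq__`. `PlainObject` is a hypothetical stand-in and says nothing about how actual scorers or splitters are implemented.

```python
import numpy as np


class PlainObject:
    # Hypothetical stand-in for a parameter object that defines no __eq__.
    def __init__(self, n_splits=5):
        self.n_splits = n_splits


default = PlainObject()

# The very same instance compares equal...
same_instance = np.array_equal(default, default)

# ...but an equivalent, freshly built instance does not, because the
# comparison falls back to object identity.
equivalent_instance = np.array_equal(PlainObject(), default)

print(same_instance, equivalent_instance)
```

Under these assumptions, an object rebuilt with the same settings as the default would be reported as non-default unless its class implements equality.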

sklearn/tests/test_base.py

bug fix rideCV

4556bf2

Revert "bug fix rideCV"

ef2daf0

This reverts commit 4556bf2.

DeaMariaLeon added 2 commits June 13, 2025 11:13

wip

6330cc4

wip

173a5bd

DeaMariaLeon added 2 commits June 13, 2025 12:04

After first feedback

5bb3ed8

added array vs scalar

a2e89ea

glemaitre self-requested a review June 13, 2025 12:56

glemaitre reviewed Jun 13, 2025

sklearn/tests/test_base.py (outdated)

glemaitre reviewed Jun 13, 2025

doc/whats_new/upcoming_changes/sklearn.base/31528.fix.rst (outdated)

glemaitre changed the title ~~FIX RideCV diagram representation with non-default alphas~~ FIX fix comparison between array-like parameters when detecting non-default params for HTML representation Jun 13, 2025

Added more generic to cover more cases

b22b13b

Removed test - not needed

12ebf8e

glemaitre approved these changes Jun 13, 2025

glemaitre added this to the 1.7.1 milestone Jun 13, 2025

glemaitre self-requested a review June 18, 2025 15:36

glemaitre reviewed Jun 18, 2025

sklearn/tests/test_base.py

DeaMariaLeon added 2 commits June 18, 2025 17:49

Added to KFOLD and get_scorer to test

95ab5f0

Merge remote-tracking branch 'upstream/main' into ridge

c774f2c

adrinjalali approved these changes Jun 19, 2025

adrinjalali merged commit b39ab89 into scikit-learn:main Jun 19, 2025
34 checks passed

DeaMariaLeon deleted the ridge branch June 19, 2025 09:16
