FIX fix comparison between array-like parameters when detecting non-default params for HTML representation #31528

Merged
merged 10 commits into from
Jun 19, 2025

Conversation

DeaMariaLeon
Contributor

Reference Issues/PRs

Fixes #31525

What does this implement/fix? Explain your changes.

As @glemaitre commented, we were comparing 2 different-size arrays:
param_value != init_default_params[param_name]
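To make the failure concrete, here is a minimal sketch of a naive `!=` check. The helper name `naive_is_non_default` and the sample values are illustrative assumptions, not code from scikit-learn. With array-likes of equal length, `!=` produces an elementwise boolean array that cannot be collapsed to a single truth value. With different lengths, the outcome depends on the NumPy version, so no result is claimed for that case here.

```python
import numpy as np


def naive_is_non_default(value, default):
    """Hypothetical helper mirroring a plain `!=` check (illustration only)."""
    # For array-likes, `!=` is elementwise, so bool() may raise ValueError.
    return bool(value != default)


default = (0.1, 1.0, 10.0)  # illustrative default, chosen for this sketch
user_value = np.array([0.1, 1.0, 10.0])

try:
    naive_is_non_default(user_value, default)
    raised = False
except ValueError:
    # The truth value of a multi-element array is ambiguous.
    raised = True

# Scalars are unaffected: `!=` returns a plain bool.
scalar_result = naive_is_non_default(1.0, 2.0)
```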

Any other comments?

😱

github-actions bot commented Jun 12, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: c774f2c. Link to the linter CI: here

@jeremiedbb
Member

Instead of wrapping it in a try except, we could use np.array_equal:

if not np.array_equal(
    param_value, init_default_params[param_name]
) and not (
    is_scalar_nan(init_default_params[param_name])
    and is_scalar_nan(param_value)
):

which should work because np.array_equal treats its arguments as array-likes.
Ideally we would also set equal_nan=True, but that doesn't work because it uses np.isnan, which is not as generic as our is_scalar_nan :(
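As a rough illustration of the suggestion above, the condition can be wrapped in a small function and tried on a few values. This is a sketch under assumptions: `is_non_default` is a hypothetical name, and `is_scalar_nan_sketch` is a simplified stand-in written for this example rather than scikit-learn's own `is_scalar_nan`.

```python
import math
import numbers

import numpy as np


def is_scalar_nan_sketch(x):
    """Simplified stand-in for scikit-learn's `is_scalar_nan` (assumption)."""
    return (
        isinstance(x, numbers.Real)
        and not isinstance(x, numbers.Integral)
        and math.isnan(x)
    )


def is_non_default(value, default):
    """Hypothetical wrapper around the condition suggested above."""
    both_nan = is_scalar_nan_sketch(value) and is_scalar_nan_sketch(default)
    # np.array_equal converts both arguments to arrays and returns one bool.
    return not np.array_equal(value, default) and not both_nan


same_values = is_non_default(np.array([1, 2]), [1, 2])
different_sizes = is_non_default(np.array([1.0, 2.0]), (0.1, 1.0, 10.0))
nan_default = is_non_default(float("nan"), float("nan"))
string_default = is_non_default("auto", "auto")
```

The separate NaN check is what lets a NaN default match a NaN value, since `np.array_equal` without `equal_nan=True` treats NaN as unequal to itself.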

@jeremiedbb
Member

Please also add a non regression test and a changelog entry. You can reuse the snippet from the issue.

@glemaitre
Member

Instead of wrapping it in a try except, we could use np.array_equal

We probably want to do that only if one of the two parameters is an array-like. Otherwise, we want to use the Python comparison for scalars.

Please also add a non regression test and a changelog entry. You can reuse the snippet from the issue.

I'm wondering if we could strengthen the non-regression test by checking different combinations of parameters (int, float, bool, array-like, ndarray, etc.).
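A table-driven check along these lines could cover the combinations mentioned. This is only a sketch: `is_non_default` is a hypothetical function based on `np.array_equal`, NaN handling is left out, and the cases are illustrative rather than taken from the scikit-learn test suite.

```python
import numpy as np


def is_non_default(value, default):
    """Hypothetical check under test; NaN handling is left out for brevity."""
    return not np.array_equal(value, default)


# Each case: (user value, default, expected "non-default" flag)
CASES = [
    (1, 1, False),                               # int
    (1, 1.0, False),                             # int vs float
    (0.5, 0.1, True),                            # float
    (True, False, True),                         # bool
    ([1, 2], [1, 2], False),                     # array-like
    (np.array([1, 2]), [1, 2], False),           # ndarray vs list
    (np.array([1, 2]), (0.1, 1.0, 10.0), True),  # different sizes
    ("auto", "auto", False),                     # string
]

results = [is_non_default(value, default) for value, default, _ in CASES]
expected = [flag for _, _, flag in CASES]
```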

This reverts commit 4556bf2.
@DeaMariaLeon
Contributor Author

Thank you both @jeremiedbb and @glemaitre

If a default parameter is [1, 2] but the user passes np.array([1, 2]), should it still be shown as a default parameter? Guillaume believes both should be shown as default, because the algorithm behaves the same way. But he suggested I ask the question here as well.

@jeremiedbb
Member

If a default parameter is [1, 2] but the user passes np.array([1, 2]), should it still be shown as a default parameter?

I'm okay with showing it as the same, because we have no way to differentiate for scalar values anyway (that would require sentinels for default values, and we're not going to use them).

We probably want to do that only if one of the two parameters is an array-like

Why only for array-likes? It also works for scalar values being any kind of object.

@glemaitre
Member

Why only for array-likes? It also works for scalar values being any kind of object.

Do you want to call np.array_equal on scalars as well? Since it is array-equal, I would have just limited it to array-likes because that seems natural (I would be scared about the conversion from a scalar to a NumPy array, but indeed there is no reason to be).

@DeaMariaLeon
Contributor Author

@glemaitre @jeremiedbb
Could you tell me if what I did is OK please?
Thanks in advance

@glemaitre glemaitre self-requested a review June 13, 2025 12:56
@glemaitre glemaitre changed the title FIX RideCV diagram representation with non-default alphas FIX fix comparison between array-like parameters when detecting non-default params for HTML representation Jun 13, 2025
@jeremiedbb
Member

Do you want to call np.array_equal on scalars as well? Since it is array-equal, I would have just limited it to array-likes because that seems natural (I would be scared about the conversion from a scalar to a NumPy array, but indeed there is no reason to be).

Yes because it works (at least for our use cases as far as I can tell) with a single condition :)
But I'd understand if you find it too convoluted and prefer to be explicit.
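For comparison, the more explicit alternative discussed here might look like the following sketch. The function name and the choice of `list`, `tuple`, and `np.ndarray` as "array-like" types are assumptions made for this example, and NaN handling is omitted.

```python
import numpy as np


def is_non_default_explicit(value, default):
    """Hypothetical explicit variant: array path only for array-likes."""
    array_types = (list, tuple, np.ndarray)
    if isinstance(value, array_types) or isinstance(default, array_types):
        # Array-likes: compare shape and contents, get a single bool back.
        return not np.array_equal(value, default)
    # Scalars and other objects: plain Python comparison.
    return value != default


list_vs_ndarray = is_non_default_explicit([1, 2], np.array([1, 2]))
different_sizes = is_non_default_explicit(np.array([1, 2, 3]), (1, 2))
same_scalar = is_non_default_explicit(3, 3)
different_strings = is_non_default_explicit("a", "b")
```

The single-condition version needs no type dispatch, while this one keeps scalars away from any array conversion, which is the trade-off being weighed in the thread.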

@DeaMariaLeon
Contributor Author

I added a couple of cases: two arrays of different sizes, and one int vs. float. I also removed the f-string after the assert; I can put it back, but I think the message needs a bit of tweaking.

@DeaMariaLeon
Contributor Author

Have you considered using hypothesis to test/find edge cases? I mean in general, in the past. I don't mean it should be used now. It's quite good to find them, as you might know.

https://hypothesis.readthedocs.io/en/latest/quickstart.html

@glemaitre
Member

Have you considered using hypothesis to test/find edge cases? I mean in general, in the past. I don't mean it should be used now. It's quite good to find them, as you might know.

A discussion has been started here: #13846

Member

@glemaitre glemaitre left a comment

LGTM on my side.

@glemaitre glemaitre added this to the 1.7.1 milestone Jun 13, 2025
@glemaitre
Member

I added the PR into the 1.7.1 milestone. Thanks @DeaMariaLeon

@adrinjalali
Member

We special case BaseEstimator here, what happens to scorers and cv splitters and all other non scalar and non-numpy objects?
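One way to probe this question is to see how `np.array_equal` treats arbitrary Python objects. The sketch below uses `SplitterSketch`, a hypothetical stand-in class that does not define `__eq__`. It only illustrates generic NumPy behavior and makes no claim about what the merged code does for scorers or CV splitters.

```python
import numpy as np


class SplitterSketch:
    """Hypothetical stand-in for a splitter-like object without __eq__."""

    def __init__(self, n_splits):
        self.n_splits = n_splits


first = SplitterSketch(5)
second = SplitterSketch(5)  # same configuration, different instance

# Each object is wrapped in a 0-d object array and compared with `==`,
# which falls back to identity when the class defines no __eq__.
same_instance = np.array_equal(first, first)
equal_config = np.array_equal(first, second)
```

Under these assumptions, two separately constructed objects with identical settings would compare as different unless their class implements equality.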

@glemaitre glemaitre self-requested a review June 18, 2025 15:36
@adrinjalali adrinjalali merged commit b39ab89 into scikit-learn:main Jun 19, 2025
34 checks passed
@DeaMariaLeon DeaMariaLeon deleted the ridge branch June 19, 2025 09:16
Development

Successfully merging this pull request may close these issues.

Issue with the RidgeCV diagram representation with non-default alphas
4 participants