FIX scoring != None for RidgeCV should used unscaled y for evaluation #29842
Cross-linking #16298 as it might be related.
So I added a test that was failing on I'm going to open another PR to not overload this PR.
I added a new parametrization to check that we support multioutput properly.
Co-authored-by: Christian Lorentzen <[email protected]>
@glemaitre Could you fix the typos in the whatsnew entry?
closes #13998
closes #15648
While discussing with @jeromedockes, we recalled having observed something odd in the `RidgeCV` code. I looked at it more closely and am opening this PR to highlight the potential problem.

In `RidgeCV`, when `sample_weight` is provided, we scale the data using `sqrt(sample_weight)`:

scikit-learn/sklearn/linear_model/_ridge.py
Lines 2133 to 2136 in 35164b3
The idea is that the mean squared error can be expressed as:
scikit-learn/sklearn/linear_model/_base.py
Lines 212 to 223 in 35164b3
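For reference, the standard identity behind this rescaling can be written as follows. The notation is assumed here and not taken from the linked code: $w_i$ is the sample weight, $y_i$ the target, and $\hat{y}_i = x_i^\top \beta$ the prediction of the linear model.

$$
\sum_i w_i \,\bigl(y_i - x_i^\top \beta\bigr)^2
= \sum_i \bigl(\sqrt{w_i}\, y_i - (\sqrt{w_i}\, x_i)^\top \beta\bigr)^2
$$

So multiplying each row of `X` and each entry of `y` by $\sqrt{w_i}$ turns the weighted squared error into an unweighted squared error on the rescaled data.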
Those "centered" data are used to optimize the ridge loss. Later in the code, we want to compute a score that can be an arbitrary metric via a scorer.
scikit-learn/sklearn/linear_model/_ridge.py
Lines 2158 to 2169 in 35164b3
The problem here is that `predictions` is computed efficiently, as described in the GCV paper. But these predictions are in the "scaled" space, and it seems incorrect to compute an arbitrary metric in this space. Instead, we should unscale these predictions and the scaled true targets to compute the metric in the original space.

This is what this PR intends to do. I did not add any non-regression test (I assume that using the MedAE should lead to some failures) because I wanted to be sure that what I'm saying is correct.
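To illustrate the concern, here is a minimal sketch in plain Python. It uses made-up toy values and hypothetical helper names (`sum_squared_error`, `median_absolute_error`), and is not the scikit-learn implementation. It suggests that the squared error is unaffected by working in the scaled space, while a metric such as the median absolute error changes unless the values are unscaled first.

```python
import math
from statistics import median

# Toy data, chosen only for illustration (not taken from the PR).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 1.0, 3.5, 2.0, 5.5]
sample_weight = [4.0, 1.0, 9.0, 1.0, 16.0]
sqrt_sw = [math.sqrt(w) for w in sample_weight]

# "Scaled" space: targets and predictions multiplied by sqrt(sample_weight).
y_true_scaled = [s * v for s, v in zip(sqrt_sw, y_true)]
y_pred_scaled = [s * v for s, v in zip(sqrt_sw, y_pred)]


def sum_squared_error(a, b):
    """Unweighted sum of squared differences."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def median_absolute_error(a, b):
    """Unweighted median of absolute differences."""
    return median([abs(u - v) for u, v in zip(a, b)])


# The weighted squared error in the original space matches the
# unweighted squared error in the scaled space.
weighted_sse = sum(
    w * (t - p) ** 2 for w, t, p in zip(sample_weight, y_true, y_pred)
)
scaled_sse = sum_squared_error(y_true_scaled, y_pred_scaled)

# An arbitrary metric does not share this property.
medae_original = median_absolute_error(y_true, y_pred)
medae_scaled = median_absolute_error(y_true_scaled, y_pred_scaled)

# Unscaling: divide by sqrt(sample_weight) before calling the metric.
y_true_unscaled = [v / s for s, v in zip(sqrt_sw, y_true_scaled)]
y_pred_unscaled = [v / s for s, v in zip(sqrt_sw, y_pred_scaled)]
medae_unscaled = median_absolute_error(y_true_unscaled, y_pred_unscaled)

print(weighted_sse, scaled_sse)
print(medae_original, medae_scaled, medae_unscaled)
```

With these toy values, the two squared errors agree, the median absolute error computed on scaled values differs from the one in the original space, and unscaling recovers the original value.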
@jeromedockes @ogrisel @lorentzenchr Does the above description make sense to you?
Edit: It seems that it relates to #13998 and #15648
Probably, I should check the tests that were written in #15648