[MRG] FIX and ENH in _RidgeGCV #15648
Conversation
Co-Authored-By: Thomas J Fan <[email protected]>
self.dual_coef_ = C[best]
if y.ndim == 2:
    y_true = y / sqrt_sw[:, np.newaxis] + y_offset
    y_pred = predictions / sqrt_sw[:, np.newaxis] + y_offset
You don't need to create a new axis; you can ravel and use np.repeat(sqrt_sw, n_y).
Is this memory efficient? I would guess broadcasting is better?
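For what it's worth, the two options should produce the same values; a minimal NumPy sketch (made-up shapes, not the actual _RidgeGCV code) comparing broadcasting against ravel + np.repeat:

```python
import numpy as np

# Toy shapes only, to compare the two options discussed above.
n_samples, n_y = 5, 3
rng = np.random.RandomState(0)
y = rng.rand(n_samples, n_y)
sqrt_sw = rng.rand(n_samples) + 0.5

# Option 1: broadcasting with a new axis (no repeated weight array is built).
out_broadcast = y / sqrt_sw[:, np.newaxis]

# Option 2: ravel + np.repeat (materializes a weight vector of length n_samples * n_y).
out_repeat = (y.ravel() / np.repeat(sqrt_sw, n_y)).reshape(n_samples, n_y)

assert np.allclose(out_broadcast, out_repeat)
```

Broadcasting avoids materializing the repeated weight vector of length n_samples * n_y, so it should not be worse memory-wise.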
if y.ndim == 2:
    squared_errors /= sample_weight[:, np.newaxis]
else:
    squared_errors /= sample_weight
It is not that easy: there is a test, ridge_sample_weight, which will fail.
Right now, the assumption is that repeating a sample 3 times leads to an error 3 times bigger.
Normalizing the sample_weight will not give this result.
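To illustrate the semantics being referred to (toy numbers, not the actual test): a weight of 3 on a sample should increase the loss exactly as if the sample were repeated 3 times, which a normalized average does not preserve.

```python
import numpy as np

# Per-sample squared errors for two samples, the first with weight 3.
errors = np.array([1.0, 2.0])
sample_weight = np.array([3.0, 1.0])

# Weighted sum: a weight of 3 behaves like seeing the sample 3 times.
weighted_loss = (errors * sample_weight).sum()        # 5.0
repeated_loss = errors[[0, 0, 0, 1]].sum()            # 5.0
assert weighted_loss == repeated_loss

# Normalizing by the sum of the weights (what np.average does) breaks this.
normalized_loss = (errors * sample_weight / sample_weight.sum()).sum()  # 1.25
assert normalized_loss != weighted_loss
```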
This part makes sure that RidgeCV() is equivalent to GridSearchCV(Ridge(), cv=LeaveOneOut()).
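For context, this is the kind of equivalence meant here; a rough sketch with synthetic data (the variable names and dataset are illustrative, not the actual test in this PR):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, RidgeCV
from sklearn.model_selection import GridSearchCV, LeaveOneOut

X, y = make_regression(n_samples=30, n_features=5, noise=1.0, random_state=0)
alphas = [0.1, 1.0, 10.0]

# Efficient generalized cross-validation (leave-one-out in closed form).
ridge_gcv = RidgeCV(alphas=alphas).fit(X, y)

# Explicit leave-one-out grid search, scored with negative MSE.
loo_search = GridSearchCV(
    Ridge(), {"alpha": alphas},
    cv=LeaveOneOut(), scoring="neg_mean_squared_error",
).fit(X, y)

# Both procedures should select the same regularization strength.
assert ridge_gcv.alpha_ == loo_search.best_params_["alpha"]
```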
assert (
    ridge_cv_no_score.cv_values_.mean() == pytest.approx(
        mean_squared_error(y, ridge_cv_score.cv_values_.ravel()))
)
From my message above, we could expect to pass sample_weight to mean_squared_error.
However, np.average normalizes by the sum of the weights, which is equivalent to normalizing the weights so that they sum to 1.
However, the loss computed internally does not do that, because we want a weight of 3 on a sample to behave like seeing the sample 3 times, i.e. increasing the loss by 3x.
So it is not straightforward what to implement.
Basically, with the initial semantics the assert would be

ridge_cv_no_score.cv_values_.mean() == pytest.approx(
    (((y - ridge_cv_score.cv_values_.ravel()) ** 2) * sample_weight).sum()
)

while the mean_squared_error version would be equivalent to

ridge_cv_no_score.cv_values_.mean() == pytest.approx(
    (((y - ridge_cv_score.cv_values_.ravel()) ** 2) * sample_weight / sample_weight.sum()).sum()
)
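As a side note, passing sample_weight to mean_squared_error does indeed give the normalized variant; a quick illustrative check on toy data (not part of the PR):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
y_true = rng.rand(10)
y_pred = rng.rand(10)
sample_weight = rng.rand(10) + 0.5

errors = (y_true - y_pred) ** 2
normalized = (errors * sample_weight / sample_weight.sum()).sum()
weighted_sum = (errors * sample_weight).sum()

# mean_squared_error uses a weighted average, i.e. the normalized form ...
assert np.isclose(
    mean_squared_error(y_true, y_pred, sample_weight=sample_weight), normalized
)
# ... which differs from the plain weighted sum unless the weights sum to 1.
assert not np.isclose(normalized, weighted_sum)
```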
> However, the loss computed internally does not do that, because we want a weight of 3 on a sample to behave like seeing the sample 3 times, i.e. increasing the loss by 3x.

Sorry, but I can't follow your discussion; isn't this PR unrelated to the loss function?
I got confused as well, so please ignore my previous comment. As far as I can tell, the changes made in this PR yield the correct behaviour for _RidgeGCV.
TODO:
@glemaitre Let me summarize my solution when scoring=None
doc/whats_new/v0.22.rst (outdated)
  `store_cv_values` is `True`.
  :pr:`15183` by :user:`Jérôme Dockès <jeromedockes>`.

- |Fix| In :class:`linear_model.RidgeCV`, the predictions reported by
I think there is some confusion in this whatsnew entry. The problem mentioned is described in issue #13998 and is not fixed in this PR.
Sorry, I see you fixed it too -- then maybe it is the PR number that needs to change?
Let's focus on the PR itself and ignore the what's new entry for now.
It checks that giving a sample weight of 3 gives the same score as repeating the sample 3 times. That is no longer the case if sample weights are not used to compute the scores. Therefore, instead of repeating samples and using GroupKFold, this test should now simply compare the GCV with a LeaveOneOut GridSearchCV, as you suggest in #15648 (comment).
However, this test should probably be kept but applied with only one hyperparameter in the grid, to check that, for the coefficients and intercept, giving sample weights is indeed equivalent to repeating samples -- for a fixed hyperparameter, not for computing the score.
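Something along those lines, for a single fixed alpha (toy data; names are illustrative, not the actual test):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.rand(6, 3)
y = rng.rand(6)

# Give the first sample a weight of 3 ...
sample_weight = np.array([3.0, 1.0, 1.0, 1.0, 1.0, 1.0])
ridge_weighted = Ridge(alpha=1.0).fit(X, y, sample_weight=sample_weight)

# ... versus physically repeating that sample two extra times.
X_rep = np.vstack([X, X[[0, 0]]])
y_rep = np.concatenate([y, y[[0, 0]]])
ridge_repeated = Ridge(alpha=1.0).fit(X_rep, y_rep)

# For a fixed hyperparameter, coefficients and intercept should match.
assert np.allclose(ridge_weighted.coef_, ridge_repeated.coef_)
assert np.isclose(ridge_weighted.intercept_, ridge_repeated.intercept_)
```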
Thanks a lot :) I've figured out the reason and mentioned it on Gitter.
Ping @glemaitre @jeromedockes, I added some tests here; perhaps it's worthwhile for you to have a look.
Fixes #4667 Fixes #4790 Fixes #13998 Fixes #15182 Fixes #15183
This PR focuses on RidgeCV.
TODO in this PR: update the doc, update what's new
TODO: issues regarding RidgeClassifierCV
Ping @glemaitre, feel free to edit or push.