[MRG] FIX and ENH in _RidgeGCV #15648


Closed (23 commits)

Conversation

@qinhanmin2014 (Member) commented Nov 18, 2019

Fixes #4667 Fixes #4790 Fixes #13998 Fixes #15182 Fixes #15183

This PR focuses on RidgeCV.

TODO in this PR: update the doc, update the what's new entry.

TODO: issues regarding RidgeClassifierCV.

ping @glemaitre feel free to edit or push

    self.dual_coef_ = C[best]
    if y.ndim == 2:
        # undo the sqrt(sample_weight) rescaling and re-add the offset
        y_true = y / sqrt_sw[:, np.newaxis] + y_offset
        y_pred = predictions / sqrt_sw[:, np.newaxis] + y_offset

Member:

You don't need to create a new axis; you can ravel and use np.repeat(sqrt_sw, n_y).

Member Author:

Is that memory efficient? I guess broadcasting is better?
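
For reference, a minimal sketch of the two equivalent formulations (the shapes of sqrt_sw and predictions are assumptions for illustration); broadcasting avoids materializing a repeated copy of the weights:

    import numpy as np

    rng = np.random.RandomState(0)
    n_samples, n_y = 5, 2
    sqrt_sw = rng.rand(n_samples) + 1.0        # assumed shape (n_samples,)
    predictions = rng.randn(n_samples, n_y)    # assumed shape (n_samples, n_y)

    # Broadcasting: the (n_samples, 1) view is not copied.
    out_broadcast = predictions / sqrt_sw[:, np.newaxis]

    # np.repeat: materializes an (n_samples * n_y,) copy of the weights.
    out_repeat = (predictions.ravel()
                  / np.repeat(sqrt_sw, n_y)).reshape(n_samples, n_y)

    assert np.allclose(out_broadcast, out_repeat)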

    if y.ndim == 2:
        squared_errors /= sample_weight[:, np.newaxis]
    else:
        squared_errors /= sample_weight
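
As a toy illustration of where the weight factor comes from (the numbers are made up): _RidgeGCV rescales the problem by sqrt(sample_weight), so a squared residual computed on the rescaled data carries a factor of sample_weight, which the division above removes:

    import numpy as np

    sample_weight = np.array([1.0, 4.0, 9.0])
    residuals = np.array([0.5, -1.0, 2.0])           # unweighted residuals

    # Residuals on the sqrt(sample_weight)-rescaled data ...
    scaled_residuals = np.sqrt(sample_weight) * residuals
    squared_errors = scaled_residuals ** 2           # carry a factor of the weight

    # ... so dividing by sample_weight recovers the unweighted errors.
    assert np.allclose(squared_errors / sample_weight, residuals ** 2)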

Member:

It is not that easy: there is a test, ridge_sample_weight, which will fail. Until now, it was assumed that repeating a sample 3 times leads to an error 3 times bigger; normalizing the sample_weight will not give this result.

Member Author:

This part makes sure that RidgeCV() is equivalent to GridSearchCV(Ridge(), cv=LeaveOneOut()).
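
A hedged sketch of that equivalence (the data and the alpha grid are arbitrary; neg_mean_squared_error is used because R^2 is undefined on the single-sample test sets of LeaveOneOut):

    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge, RidgeCV
    from sklearn.model_selection import GridSearchCV, LeaveOneOut

    X, y = make_regression(n_samples=30, n_features=5, random_state=0)
    alphas = [0.1, 1.0, 10.0]

    ridge_cv = RidgeCV(alphas=alphas).fit(X, y)
    grid = GridSearchCV(
        Ridge(), {"alpha": alphas},
        scoring="neg_mean_squared_error", cv=LeaveOneOut(),
    ).fit(X, y)

    # Both procedures should select the same regularization strength.
    assert ridge_cv.alpha_ == grid.best_params_["alpha"]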


    assert (
        ridge_cv_no_score.cv_values_.mean() == pytest.approx(
            mean_squared_error(y, ridge_cv_score.cv_values_.ravel()))
    )

Member:

From my message above, we could expect to pass sample_weight to mean_squared_error. However, np.average normalizes by the sum of the weights, which is equivalent to normalizing the weights so that they sum to 1.

However, the loss internally does no such thing, because we want a weight of 3 to correspond to seeing that sample 3 times, increasing the loss by a factor of 3.

So it is not straightforward what to implement.

Basically, with the initial semantics the assert would be:

    ridge_cv_no_score.cv_values_.mean() == pytest.approx(
        (((y - ridge_cv_score.cv_values_.ravel()) ** 2) * sample_weight).sum()
    )

while the mean_squared_error version would be equivalent to:

    ridge_cv_no_score.cv_values_.mean() == pytest.approx(
        (((y - ridge_cv_score.cv_values_.ravel()) ** 2) * sample_weight / sample_weight.sum()).sum()
    )
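
A small numeric illustration of the difference (the values are made up): mean_squared_error with sample_weight uses np.average, i.e. it divides by the weight sum, while the sum semantics does not:

    import numpy as np
    from sklearn.metrics import mean_squared_error

    y_true = np.array([1.0, 2.0, 3.0])
    y_pred = np.array([1.5, 2.0, 2.0])
    sample_weight = np.array([3.0, 1.0, 1.0])

    # Sum semantics: a weight of 3 adds the error three times.
    weighted_sum = (((y_true - y_pred) ** 2) * sample_weight).sum()   # 1.75

    # np.average semantics: normalized by the weight sum.
    weighted_avg = mean_squared_error(y_true, y_pred,
                                      sample_weight=sample_weight)    # 0.35

    assert np.isclose(weighted_avg, weighted_sum / sample_weight.sum())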

Member Author:

> However, the loss internally does no such thing, because we want a weight of 3 to correspond to seeing that sample 3 times, increasing the loss by a factor of 3.

Sorry, but I can't follow your discussion; this PR is not related to the loss function, is it?

Contributor:

I got confused as well, so please ignore my previous comment. As far as I can tell, the changes made in this PR yield the correct behaviour for _RidgeGCV.

@qinhanmin2014 (Member Author)

TODO:
(1) debug test_ridge_gcv_sample_weights (I commented that test out)
(2) more tests

@qinhanmin2014 (Member Author)

@glemaitre Let me summarize my solution when scoring=None:
(1) For the errors, I think we should not report weighted errors.
(2) For the scores, I think we should stay consistent with GridSearchCV (i.e., RidgeCV() should be equivalent to GridSearchCV(Ridge(), cv=LeaveOneOut())).
Do you agree?

      `store_cv_values` is `True`.
      :pr:`15183` by :user:`Jérôme Dockès <jeromedockes>`.

    - |Fix| In :class:`linear_model.RidgeCV`, the predictions reported by

Contributor:

I think there is some confusion in this what's new entry: the problem mentioned is described in issue #13998 and is not fixed in this PR.

Contributor:

Sorry, I see you fixed it too; then maybe it is the PR number that needs to change?

Member Author:

Let's focus on the PR itself and ignore the what's new entry for now.

@jeromedockes (Contributor)

> (1) debug test_ridge_gcv_sample_weights (I commented that test out)

It checks that giving a sample a weight of 3 gives the same score as repeating that sample 3 times. That is no longer the case if sample weights are not used to compute the scores. Therefore, instead of repeating samples and using GroupKFold, this test should now simply compare the GCV against a LeaveOneOut GridSearch, as you suggest in #15648 (comment).

@jeromedockes (Contributor)

However, this test should probably be kept, applied with only one hyperparameter in the grid, to check that giving sample weights is indeed equivalent to repeating samples for the coefficients and intercept (for a fixed hyperparameter, not for computing the score); see the sketch below.
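
A sketch of that fixed-hyperparameter property (random data for illustration): with a single fixed alpha, an integer sample weight is equivalent to repeating the sample, as far as the coefficients and intercept are concerned:

    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.RandomState(0)
    X, y = rng.randn(10, 3), rng.randn(10)

    # Weight the first sample three times ...
    sample_weight = np.ones(10)
    sample_weight[0] = 3.0
    ridge_w = Ridge(alpha=1.0).fit(X, y, sample_weight=sample_weight)

    # ... versus repeating it three times.
    X_rep = np.vstack([X[:1], X[:1], X])
    y_rep = np.concatenate([y[:1], y[:1], y])
    ridge_rep = Ridge(alpha=1.0).fit(X_rep, y_rep)

    assert np.allclose(ridge_w.coef_, ridge_rep.coef_)
    assert np.isclose(ridge_w.intercept_, ridge_rep.intercept_)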

@qinhanmin2014 (Member Author)

> It checks that giving a sample a weight of 3 gives the same score as repeating that sample 3 times. [...]

Thanks a lot :) I had figured out this reason and mentioned it on gitter.

@qinhanmin2014 (Member Author)

ping @glemaitre @jeromedockes I added some tests here; perhaps it's worthwhile for you to have a look.

@qinhanmin2014 changed the title from "[WIP] FIX and ENH in _RidgeGCV" to "[MRG] FIX and ENH in _RidgeGCV" on Nov 22, 2019.
Base branch automatically changed from master to main on Jan 22, 2021.