[MRG] FIX and ENH in _RidgeGCV #15648
Closed
23 commits
3aac2e5 do not store all cv values nor all dual coef in _RidgeGCV fit (jeromedockes)
51b9add Apply suggestions from code review (jeromedockes)
a78be8c whatsnew entry (jeromedockes)
bfb6d98 add tests for different scorers in _RidgeGCV (jeromedockes)
c7baa7c add note in _RidgeGCV docstring (jeromedockes)
61782b3 move whatsnew entry (jeromedockes)
89dde17 git move ridge -> _ridge (glemaitre)
02e694e Merge remote-tracking branch 'origin/master' into pr/jeromedockes/15183 (glemaitre)
b4cf560 TST add additional tests (glemaitre)
98d2f07 fix (glemaitre)
097e783 TST check equivalence scoring none and mse (glemaitre)
5188494 MNT FIX and ENH in _RidgeGCV (qinhanmin2014)
2de9c67 trigger CI (qinhanmin2014)
ff0a5ab small bug (qinhanmin2014)
485a72f small bug (qinhanmin2014)
34a56b4 Merge remote-tracking branch 'upstream/master' into 15183 (qinhanmin2014)
fef0b46 clean what's new (qinhanmin2014)
4e183e6 remove tests (qinhanmin2014)
82ed115 remove redundant test (we no longer return weighted error) (qinhanmin2014)
703a93e clearer solution (qinhanmin2014)
90b7c68 add some tests (qinhanmin2014)
72c1d1b flake8 (qinhanmin2014)
cb3c379 more tests (qinhanmin2014)
@@ -1054,6 +1054,16 @@ def _matmat(self, v):
         return res


+class _IdentityEstimator:
+    """Hack to call a scorer when we already have the predictions."""
+
+    def decision_function(self, y_predict):
+        return y_predict
+
+    def predict(self, y_predict):
+        return y_predict
+
+
 class _RidgeGCV(LinearModel):
     """Ridge regression with built-in Generalized Cross-Validation

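To illustrate why `_IdentityEstimator` is enough for a scorer, here is a minimal sketch. A scorer is called as `(estimator, X, y)` and internally asks the estimator for predictions, so passing precomputed predictions as `X` to an estimator that returns its input makes the scorer compare those predictions with `y`. The `neg_mse_scorer` function is a hypothetical stand-in written for this example, not a scikit-learn object, and `IdentityEstimator` is a local copy of the class in the diff.

```python
import numpy as np


class IdentityEstimator:
    """Local copy of the helper from the diff: returns its input unchanged."""

    def decision_function(self, y_predict):
        return y_predict

    def predict(self, y_predict):
        return y_predict


def neg_mse_scorer(estimator, X, y):
    """Hypothetical scorer with the (estimator, X, y) calling convention."""
    return -np.mean((estimator.predict(X) - y) ** 2)


y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])  # e.g. precomputed leave-one-out predictions

# The predictions take the place of X; the estimator hands them back untouched.
score = neg_mse_scorer(IdentityEstimator(), y_pred, y_true)
print(score)
```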
@@ -1087,6 +1097,10 @@ class _RidgeGCV(LinearModel):

     looe = y - loov = c / diag(G^-1)

+    The best score (negative mean squared error or user-provided scoring) is
+    stored in the `best_score_` attribute, and the selected hyperparameter in
+    `alpha_`.
+
     References
     ----------
     http://cbcl.mit.edu/publications/ps/MIT-CSAIL-TR-2007-025.pdf
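The docstring identity `looe = c / diag(G^-1)` can be checked numerically. The sketch below assumes the simplest setting (linear kernel `K = X X^T`, `G = K + alpha * I`, no intercept, no sample weights) and compares the shortcut against a brute-force loop that refits ridge with each sample left out. The data and variable names are made up for the example; this is not the code path used by `_RidgeGCV`.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.normal(size=8)
alpha = 0.5
n_samples, n_features = X.shape

# Shortcut: one inverse of G gives every leave-one-out error at once.
G_inv = np.linalg.inv(X @ X.T + alpha * np.eye(n_samples))
c = G_inv @ y  # dual coefficients
looe_shortcut = c / np.diag(G_inv)

# Brute force: refit ridge (no intercept) without sample i, then predict it.
looe_brute = np.empty(n_samples)
for i in range(n_samples):
    keep = np.arange(n_samples) != i
    X_train, y_train = X[keep], y[keep]
    w = np.linalg.solve(
        X_train.T @ X_train + alpha * np.eye(n_features), X_train.T @ y_train
    )
    looe_brute[i] = y[i] - X[i] @ w

print(np.allclose(looe_shortcut, looe_brute))
```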
@@ -1460,45 +1474,59 @@ def fit(self, X, y, sample_weight=None):
             X, y = _rescale_data(X, y, sample_weight)
             sqrt_sw = np.sqrt(sample_weight)
         else:
-            sqrt_sw = np.ones(X.shape[0], dtype=X.dtype)
+            sample_weight = sqrt_sw = np.ones(X.shape[0], dtype=X.dtype)

+        X_mean, *decomposition = decompose(X, y, sqrt_sw)
+
         scorer = check_scoring(self, scoring=self.scoring, allow_none=True)
         error = scorer is None

         n_y = 1 if len(y.shape) == 1 else y.shape[1]
-        cv_values = np.zeros((n_samples * n_y, len(self.alphas)),
-                             dtype=X.dtype)
-        C = []
-        X_mean, *decomposition = decompose(X, y, sqrt_sw)

+        if self.store_cv_values:
+            self.cv_values_ = np.empty(
+                (n_samples * n_y, len(self.alphas)), dtype=X.dtype)
+
+        best_coef, best_score, best_alpha = None, None, None
+
         for i, alpha in enumerate(self.alphas):
             G_inverse_diag, c = solve(
                 float(alpha), y, sqrt_sw, X_mean, *decomposition)
             if error:
                 squared_errors = (c / G_inverse_diag) ** 2
-                cv_values[:, i] = squared_errors.ravel()
+                # convert errors back to the original space
+                # return converted errors
+                # calculate scores based on converted errors
+                if y.ndim == 2:
+                    squared_errors /= sample_weight[:, np.newaxis]
+                else:
+                    squared_errors /= sample_weight
+                # consistent with default multioutput of mean_squared_error
+                alpha_score = -squared_errors.mean()
+                if self.store_cv_values:
+                    self.cv_values_[:, i] = squared_errors.ravel()
             else:
                 predictions = y - (c / G_inverse_diag)
-                cv_values[:, i] = predictions.ravel()
-            C.append(c)
-
-        if error:
-            best = cv_values.mean(axis=0).argmin()
-        else:
-            # The scorer want an object that will make the predictions but
-            # they are already computed efficiently by _RidgeGCV. This
-            # identity_estimator will just return them
-            def identity_estimator():
-                pass
-            identity_estimator.decision_function = lambda y_predict: y_predict
-            identity_estimator.predict = lambda y_predict: y_predict
-
-            # signature of scorer is (estimator, X, y)
-            out = [scorer(identity_estimator, cv_values[:, i], y.ravel())
-                   for i in range(len(self.alphas))]
-            best = np.argmax(out)
-
-        self.alpha_ = self.alphas[best]
-        self.dual_coef_ = C[best]
+                # convert predictions back to the original space
+                # return converted predictions
+                # calculate scores based on converted predictions
+                if y.ndim == 2:
+                    y_true = y / sqrt_sw[:, np.newaxis] + y_offset
+                    y_pred = predictions / sqrt_sw[:, np.newaxis] + y_offset

Review comment: you don't need to create new axis. you can ravel and use

Review comment: Is this memory efficient? I guess broadcast is better?

+                else:
+                    y_true = y / sqrt_sw + y_offset
+                    y_pred = predictions / sqrt_sw + y_offset
+                # let the underlying scorer to handle multioutput
+                alpha_score = scorer(
+                    _IdentityEstimator(), y_pred, y_true)
+                if self.store_cv_values:
+                    self.cv_values_[:, i] = y_pred.ravel()
+            if (best_score is None) or (alpha_score > best_score):
+                best_coef, best_score, best_alpha = c, alpha_score, alpha
+
+        self.alpha_ = best_alpha
+        self.best_score_ = best_score
+        self.dual_coef_ = best_coef
         self.coef_ = safe_sparse_dot(self.dual_coef_.T, X)

         X_offset += X_mean * X_scale
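On the review question about `sqrt_sw[:, np.newaxis]` and memory: a small sketch, with made-up numbers, suggesting that the new axis is only a view onto the same buffer and that broadcasting then divides each row by its weight without first materialising a repeated copy of the weights. The array values here are assumptions chosen so the result is easy to read.

```python
import numpy as np

sqrt_sw = np.sqrt(np.array([1.0, 4.0, 9.0]))  # one value per sample
predictions = np.array([[1.0, 2.0],
                        [4.0, 6.0],
                        [9.0, 12.0]])         # shape (n_samples, n_targets)

# Adding an axis creates a (n_samples, 1) view; no data is copied.
column = sqrt_sw[:, np.newaxis]
print(np.shares_memory(column, sqrt_sw))

# Broadcasting stretches the single column across all targets on the fly.
rescaled = predictions / column
print(rescaled)
```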
@@ -1509,7 +1537,7 @@ def identity_estimator():
                 cv_values_shape = n_samples, len(self.alphas)
             else:
                 cv_values_shape = n_samples, n_y, len(self.alphas)
-            self.cv_values_ = cv_values.reshape(cv_values_shape)
+            self.cv_values_ = self.cv_values_.reshape(cv_values_shape)

         return self

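The reshape in this hunk relies on each column having been filled with a C-order `ravel()` of an `(n_samples, n_y)` array. A minimal sketch with toy values (the sizes and the `per_alpha` list are assumptions for illustration) shows that reshaping to `(n_samples, n_y, n_alphas)` recovers the per-alpha arrays:

```python
import numpy as np

n_samples, n_y, n_alphas = 3, 2, 2

# Toy per-alpha values, one (n_samples, n_y) array for each alpha.
per_alpha = [
    np.arange(6.0).reshape(n_samples, n_y),
    10.0 + np.arange(6.0).reshape(n_samples, n_y),
]

# Filled column by column, as in the fit loop of the diff.
cv_values = np.empty((n_samples * n_y, n_alphas))
for i, values in enumerate(per_alpha):
    cv_values[:, i] = values.ravel()

# The final reshape restores the (sample, target, alpha) layout.
cv_values = cv_values.reshape(n_samples, n_y, n_alphas)
print(np.array_equal(cv_values[:, :, 1], per_alpha[1]))
```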
@@ -1565,6 +1593,7 @@ def fit(self, X, y, sample_weight=None):
                                   store_cv_values=self.store_cv_values)
             estimator.fit(X, y, sample_weight=sample_weight)
             self.alpha_ = estimator.alpha_
+            self.best_score_ = estimator.best_score_
             if self.store_cv_values:
                 self.cv_values_ = estimator.cv_values_
         else:
@@ -1580,6 +1609,7 @@ def fit(self, X, y, sample_weight=None):
             gs.fit(X, y, sample_weight=sample_weight)
             estimator = gs.best_estimator_
             self.alpha_ = gs.best_estimator_.alpha
+            self.best_score_ = gs.best_score_

         self.coef_ = estimator.coef_
         self.intercept_ = estimator.intercept_
Review comment: It is not that easy. There is a test, `ridge_sample_weight`, which will fail. Right now it was thought that repeating a sample 3 times will lead to an error 3 times bigger. Normalizing the sample_weight will not lead to this result.
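One possible reading of this comment, sketched under stated assumptions: for ridge without an intercept, giving a sample weight 3 is equivalent to repeating it three times, so its squared error counts three times in the total. Rescaling the weights to sum to one keeps their ratios but changes their balance against a fixed `alpha`, so the fit is no longer the same. The `ridge_coef` helper and the data are hypothetical and written only for this example; the test named in the comment is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
y = rng.normal(size=5)
alpha = 1.0
weights = np.array([3.0, 1.0, 1.0, 1.0, 1.0])


def ridge_coef(X, y, sample_weight, alpha):
    """Hypothetical helper: weighted ridge without intercept, closed form."""
    XtW = X.T * sample_weight  # scales each sample's column by its weight
    return np.linalg.solve(XtW @ X + alpha * np.eye(X.shape[1]), XtW @ y)


# Same data with the first sample physically present three times.
X_rep = np.vstack([X[[0, 0]], X])
y_rep = np.concatenate([y[[0, 0]], y])

w_weighted = ridge_coef(X, y, weights, alpha)
w_repeated = ridge_coef(X_rep, y_rep, np.ones(len(y_rep)), alpha)

# Weighted total error equals the plain total error on the repeated data.
weighted_sum = np.sum(weights * (y - X @ w_weighted) ** 2)
repeated_sum = np.sum((y_rep - X_rep @ w_repeated) ** 2)

# Normalised weights give a different fit for the same alpha.
w_normalized = ridge_coef(X, y, weights / weights.sum(), alpha)
```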
Review comment: This part makes sure that `RidgeCV()` is equivalent to `GridSearchCV(Ridge(), cv=LeaveOneOut())`.
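The equivalence mentioned in this comment can be sketched as follows. The example assumes a scikit-learn version in which `RidgeCV` exposes `best_score_` (the attribute this PR introduces), uses synthetic data, and passes `scoring="neg_mean_squared_error"` to the grid search explicitly, since the default R² score is not defined on single-sample test folds. It is an illustration, not the test added in the PR.

```python
import numpy as np
from sklearn.linear_model import Ridge, RidgeCV
from sklearn.model_selection import GridSearchCV, LeaveOneOut

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)
alphas = [0.1, 1.0, 10.0]

# Efficient leave-one-out path (cv=None is the default).
ridge_cv = RidgeCV(alphas=alphas).fit(X, y)

# Explicit leave-one-out grid search over the same alphas.
grid = GridSearchCV(
    Ridge(),
    {"alpha": alphas},
    cv=LeaveOneOut(),
    scoring="neg_mean_squared_error",
).fit(X, y)

print(ridge_cv.alpha_, grid.best_params_["alpha"])
print(ridge_cv.best_score_, grid.best_score_)
```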