[MRG+1] Fix Bayesian ridge regression #12174
|
woooottt I have not looked at this code for the last 8 years :) (git blame will confirm...) if the skipped test now passes with your changes then you have a non-regression test... thx @albertcthomas for looking into this |
|
The scikit-learn implementation uses the MacKay updates and according to section 2.2 in A New View of Automatic Relevance Determination (Wipf and Nagarajan, 2008) "MacKay updates do not even guarantee cost function decrease" so the skipped test should be removed. |
|
ok can you just insert comments in the code to say this? thx
|
|
Ok I removed the skipped test, added details to the docstring about the implementation and how the parameters are updated, and fixed the output values for |
b84a15e to
3d964ea
Compare
|
This PR is ready for reviews. I added a non regression test: AssertionError:
Arrays are not almost equal to 9 decimals
ACTUAL: -2901.7989775709366
DESIRED: -3410.6536570484113 |
agramfort
left a comment
besides my nitpick LGTM
thx @albertcthomas
| ----- | ||
| For an example, see :ref:`examples/linear_model/plot_bayesian_ridge.py | ||
| <sphx_glr_auto_examples_linear_model_plot_bayesian_ridge.py>`. | ||
| There exist several strategies to perform Bayesian ridge regression. This |
exist -> exists
I think 'exist' is correct here.
|
Thanks for the review @agramfort |
sklearn/linear_model/bayes.py
Outdated
| # return regularization parameters | ||
| self.alpha_ = alpha_ | ||
| self.lambda_ = lambda_ | ||
| # and corresponding posterior mean and posterior covariance |
Here you recompute the posterior mean and posterior covariance, exactly as in the for-loop.
Could you refactor the two computations into a function, to avoid duplication?
And don't you need to update logdet_sigma_ also?
> And don't you need to update logdet_sigma_ also?
Indeed and rmse_ as well. Thanks!
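For readers following the refactoring request above, here is a minimal sketch of what a shared helper could look like. It is not the code merged in this PR: the names `posterior_update` and `fit_sketch` are hypothetical, the Gamma hyperprior terms are dropped (flat hyperpriors assumed), and the convergence check is a simplification. The point is only that the loop and the final step call the same function, so `coef_`, `sigma_`, `logdet_sigma_` and `rmse_` all correspond to the last values of `alpha_` and `lambda_`.

```python
import numpy as np


def posterior_update(X, y, alpha_, lambda_):
    """Hypothetical helper: posterior quantities for fixed alpha_ and lambda_.

    alpha_ is the noise precision, lambda_ the precision of the weights.
    """
    n_features = X.shape[1]
    # Posterior precision matrix: lambda_ * I + alpha_ * X^T X
    precision = lambda_ * np.eye(n_features) + alpha_ * (X.T @ X)
    sigma_ = np.linalg.inv(precision)
    coef_ = alpha_ * (sigma_ @ X.T @ y)
    # log det(Sigma) = -log det(precision)
    _, logdet_precision = np.linalg.slogdet(precision)
    logdet_sigma_ = -logdet_precision
    # Named rmse_ as in the thread; it is the residual sum of squares.
    rmse_ = np.sum((y - X @ coef_) ** 2)
    return coef_, sigma_, logdet_sigma_, rmse_


def fit_sketch(X, y, n_iter=300, tol=1e-3):
    """MacKay-style updates with flat hyperpriors (an assumption of this sketch)."""
    n_samples = X.shape[0]
    alpha_ = 1.0 / (np.var(y) + np.finfo(np.float64).eps)
    lambda_ = 1.0
    eigen_vals = np.linalg.eigvalsh(X.T @ X)
    coef_old = None
    for _ in range(n_iter):
        coef_, sigma_, logdet_sigma_, rmse_ = posterior_update(
            X, y, alpha_, lambda_)
        # Effective number of well-determined parameters
        gamma_ = np.sum(alpha_ * eigen_vals / (lambda_ + alpha_ * eigen_vals))
        lambda_ = gamma_ / np.sum(coef_ ** 2)
        alpha_ = (n_samples - gamma_) / rmse_
        if coef_old is not None and np.sum(np.abs(coef_old - coef_)) < tol:
            break
        coef_old = coef_
    # Recompute once more so every returned quantity matches the final
    # alpha_ and lambda_, instead of the values from before the last update.
    coef_, sigma_, logdet_sigma_, rmse_ = posterior_update(
        X, y, alpha_, lambda_)
    return coef_, sigma_, alpha_, lambda_, logdet_sigma_, rmse_
```

With this layout the duplicated block disappears, and forgetting to refresh one of the four quantities after the loop is no longer possible.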
|
You'll need a whats new entry. |
|
Thanks for the review @TomDLT. |
|
@TomDLT you put the PR number instead of the issue number in whatsnew. |
We usually do. |
|
Ah ok it makes sense actually :). thanks @TomDLT |
|
Thanks for the ping @auvipy! |
|
wc!! |
…update (scikit-learn#12174)" This reverts commit 1bdbbbd.
Reference Issues/PRs
This PR is taking over PR #11294 (duplicate of PR #10751) which is fixing the sign of log(det(Sigma)) in the computation of the marginal log likelihood in BayesianRidge.
Fixes issue #10748
What does this implement/fix? Explain your changes.
In order to implement a non regression test for the score I took some time to understand the code and I found some issues about the returned attributes and the doc.
- The references given in the doc and the docstring are 'Bayesian interpolation' (MacKay, 1992) and slides from a lecture which seems to follow 'Pattern Recognition and Machine Learning' by Bishop. However the scikit-learn implementation of Bayesian ridge regression (using Gamma distribution for the hyperpriors) is just Automatic Relevance Determination with the same hyperparameter `lambda` for all coordinates of `w`. Hence IMO the best reference to understand the code (and the one that I found to be the most helpful) is the one given for Automatic Relevance Determination, i.e., 'Sparse Bayesian Learning and the Relevance Vector Machine' by Tipping. So I would add this reference to the docstring and the doc of Bayesian ridge regression.
- The docstring of the `scores_` attribute reads: value of the objective function (to be maximized). I think we should be clearer about what the objective function is here, i.e., the marginal log likelihood, and maybe give the formula in the doc (equation (36) in the paper by Tipping) to be clear about what is done in the scikit-learn implementation.
- The `coef_` attribute does not correspond to the last update of `alpha_` and `lambda_` whereas the `sigma_` matrix is the one corresponding to the last update. I think this is a bug for `coef_` as when doing prediction the `coef_` attribute should be the one updated thanks to the last update values of `alpha_` and `lambda_` (see equations (19), (21) and (22) in Tipping).
- Besides it was decided to return the `alpha_` and `lambda_` parameters corresponding to the (non updated) `coef_` attribute and not the last updates of `alpha_` and `lambda_` in issue #8224. I think this is wrong as we should also return the last updates because they are the ones we should use in the posterior. If it is agreed that we should update `coef_` as explained above, the last updates of `alpha_` and `lambda_` will correspond to the updated `coef_` value.
- The first test in `test_bayes` is skipped because it is broken and I think we should fix it or remove it but I am not certain about the correct output yet, maybe @agramfort can help for this one? It checks that the scores are increasing.

Tasks:

- Fix sign of logdet Sigma in the score (previous PRs)
- Improve reading of the code: variable `sigma` is now named `scaled_sigma` and update some of the comments
- Non regression tests
- Update doc and refs
- Remove skipped increasing scores test.
- Fix `coef_`, `lambda_` and `alpha_` attributes.

Closes #11294 #10751 #10748
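To make the sign fix for log(det(Sigma)) concrete, here is a small, self-contained check. It is a sketch, not the scikit-learn code: both function names are hypothetical, and the Gamma hyperprior terms of the score are left out. The first function is the usual closed form of the marginal log likelihood written with the posterior covariance, where `logdet_sigma_` enters with a plus sign. The second evaluates the same quantity directly from the marginal distribution of `y`, a zero-mean Gaussian with covariance `I / alpha_ + X X^T / lambda_`. The two only agree when the sign is positive.

```python
import numpy as np


def log_marginal_likelihood(X, y, alpha_, lambda_):
    """Closed form via the posterior; hyperprior terms omitted (assumption)."""
    n_samples, n_features = X.shape
    precision = lambda_ * np.eye(n_features) + alpha_ * (X.T @ X)
    coef_ = alpha_ * np.linalg.solve(precision, X.T @ y)
    # Sigma is the inverse of `precision`, so log det(Sigma) is the negative
    # of log det(precision).
    _, logdet_precision = np.linalg.slogdet(precision)
    logdet_sigma_ = -logdet_precision
    rss = np.sum((y - X @ coef_) ** 2)
    return 0.5 * (n_features * np.log(lambda_)
                  + n_samples * np.log(alpha_)
                  - alpha_ * rss
                  - lambda_ * np.sum(coef_ ** 2)
                  + logdet_sigma_  # plus sign: this is the fix discussed here
                  - n_samples * np.log(2 * np.pi))


def log_marginal_likelihood_direct(X, y, alpha_, lambda_):
    """Same quantity from the Gaussian density of y with the weights integrated out."""
    n_samples = X.shape[0]
    cov = np.eye(n_samples) / alpha_ + (X @ X.T) / lambda_
    _, logdet_cov = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (n_samples * np.log(2 * np.pi) + logdet_cov + quad)
```

A comparison of this kind can serve as a non regression test for the score that does not depend on hard-coded reference values.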