
Conversation

@albertcthomas
Contributor

@albertcthomas albertcthomas commented Sep 26, 2018

Reference Issues/PRs

This PR takes over PR #11294 (duplicate of PR #10751), which fixes the sign of log(det(Sigma)) in the computation of the marginal log likelihood in BayesianRidge.
Fixes issue #10748

What does this implement/fix? Explain your changes.

In order to implement a non-regression test for the score, I took some time to understand the code and I found some issues with the returned attributes and the doc.

  • The references given in the doc and the docstring are 'Bayesian Interpolation' (MacKay, 1992) and slides from a lecture that seems to follow 'Pattern Recognition and Machine Learning' by Bishop. However, the scikit-learn implementation of Bayesian ridge regression (using Gamma distributions for the hyperpriors) is just Automatic Relevance Determination with the same hyperparameter lambda for all coordinates of w. Hence, IMO, the best reference for understanding the code (and the one I found most helpful) is the one given for Automatic Relevance Determination, i.e., 'Sparse Bayesian Learning and the Relevance Vector Machine' by Tipping. So I would add this reference to the docstring and the doc of Bayesian ridge regression.

  • The docstring of the scores_ attribute reads: "value of the objective function (to be maximized)". I think we should be clearer about what the objective function is here, i.e., the marginal log likelihood, and maybe give the formula in the doc (equation (36) in the paper by Tipping) to be clear about what is done in the scikit-learn implementation:

[image: objective function]

  • The returned coef_ attribute does not correspond to the last update of alpha_ and lambda_, whereas the sigma_ matrix does correspond to the last update. I think this is a bug for coef_: when doing prediction, the coef_ attribute should be the one computed from the last updated values of alpha_ and lambda_ (see equations (19), (21) and (22) in Tipping):
    [image: prediction]

Besides, in issue #8224 it was decided to return the alpha_ and lambda_ parameters corresponding to the (non-updated) coef_ attribute, and not the last updates of alpha_ and lambda_. I think this is wrong, as we should return the last updates because those are the ones we should use in the posterior. If it is agreed that we should update coef_ as explained above, the last updates of alpha_ and lambda_ will correspond to the updated coef_ value.

  • The first test in test_bayes is skipped because it is broken. I think we should fix it or remove it, but I am not certain about the correct output yet; maybe @agramfort can help with this one? It checks that the scores are increasing.

  • Fix sign of logdet Sigma in the score (previous PRs)

  • Improve readability of the code: the variable sigma is renamed scaled_sigma, and some of the comments are updated

  • Non-regression tests

  • Update doc and refs

  • Remove skipped increasing scores test.

  • Fix coef_, lambda_ and alpha_ attributes.
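The quantities discussed above fit in a few lines. Here is a minimal numpy sketch of the posterior update (Tipping, equations (19), (21), (22)) and the marginal log likelihood with the corrected +log|Sigma| term; variable names are illustrative and this is not the actual scikit-learn internal code:

```python
import numpy as np

def posterior_and_score(X, y, alpha, lam):
    """Posterior mean/covariance and marginal log likelihood of Bayesian
    ridge regression for fixed alpha (noise precision) and lam (weight
    precision). A sketch following Tipping (2001), eqs. (19)-(22), (36)."""
    n_samples, n_features = X.shape
    # Posterior covariance: Sigma = (lam * I + alpha * X^T X)^{-1}
    sigma = np.linalg.inv(lam * np.eye(n_features) + alpha * X.T @ X)
    # Posterior mean: w = alpha * Sigma * X^T y
    coef = alpha * sigma @ X.T @ y
    rss = np.sum((y - X @ coef) ** 2)
    # log|Sigma| enters the score with a *plus* sign (the sign bug fixed here)
    logdet_sigma = np.linalg.slogdet(sigma)[1]
    score = 0.5 * (n_features * np.log(lam)
                   + n_samples * np.log(alpha)
                   - alpha * rss
                   - lam * coef @ coef
                   + logdet_sigma
                   - n_samples * np.log(2 * np.pi))
    return coef, sigma, score
```

With the wrong sign on log|Sigma|, the score reported in scores_ is shifted by the full log-determinant, which is what the non-regression test below detects.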

Closes #11294 #10751 #10748

@agramfort
Member

woooottt I have not looked at this code for the last 8 years :) (git blame will confirm...)

if the skipped test now passes with your changes then you have a non-regression test...

thx @albertcthomas for looking into this

@albertcthomas
Contributor Author

The scikit-learn implementation uses the MacKay updates, and according to section 2.2 in 'A New View of Automatic Relevance Determination' (Wipf and Nagarajan, 2008), "MacKay updates do not even guarantee cost function decrease", so the skipped test should be removed.
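For context, the MacKay re-estimation formulas can be sketched as the fixed-point loop below (a hedged numpy illustration with made-up names, not the scikit-learn code); nothing in these updates forces the marginal log likelihood to increase from one iteration to the next:

```python
import numpy as np

def mackay_updates(X, y, n_iter=5, alpha=1.0, lam=1.0):
    """Iterate the MacKay re-estimation formulas for the hyperparameters.
    gamma is the effective number of well-determined parameters; these
    updates are not guaranteed to increase the marginal log likelihood
    at every step (Wipf and Nagarajan, 2008, sec. 2.2)."""
    n_samples, n_features = X.shape
    for _ in range(n_iter):
        # posterior covariance and mean for the current (alpha, lam)
        sigma = np.linalg.inv(lam * np.eye(n_features) + alpha * X.T @ X)
        coef = alpha * sigma @ X.T @ y
        # effective number of parameters, gamma in (0, n_features)
        gamma = n_features - lam * np.trace(sigma)
        # MacKay fixed-point re-estimates
        lam = gamma / (coef @ coef)
        alpha = (n_samples - gamma) / np.sum((y - X @ coef) ** 2)
    return alpha, lam, coef
```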

@agramfort
Copy link
Member

agramfort commented Sep 28, 2018 via email

@albertcthomas
Contributor Author

OK, I removed the skipped test, added details to the docstring about the implementation and how the parameters are updated, and fixed the output values for coef_, alpha_ and lambda_. It should now be possible to add a non-regression test for the score computation (the main motivation of this PR :)). I will do that soon, as well as add a few more details in the doc.

@albertcthomas albertcthomas changed the title [WIP] Bayesian ridge regression [MRG] Bayesian ridge regression Oct 1, 2018
@albertcthomas
Contributor Author

This PR is ready for reviews. I added a non-regression test: test_bayesian_ridge_score_values.py. The test passes with this PR. When run on master, the test returns:

AssertionError:
Arrays are not almost equal to 9 decimals
 ACTUAL: -2901.7989775709366
 DESIRED: -3410.6536570484113

@albertcthomas albertcthomas changed the title [MRG] Bayesian ridge regression [MRG] Fix Bayesian ridge regression Oct 1, 2018
Member

@agramfort agramfort left a comment


besides my nitpick LGTM

thx @albertcthomas

-----
For an example, see :ref:`examples/linear_model/plot_bayesian_ridge.py
<sphx_glr_auto_examples_linear_model_plot_bayesian_ridge.py>`.
There exist several strategies to perform Bayesian ridge regression. This
Member


exist -> exists

Contributor Author


I think 'exist' is correct here.

@agramfort agramfort changed the title [MRG] Fix Bayesian ridge regression [MRG+1] Fix Bayesian ridge regression Oct 7, 2018
@albertcthomas
Contributor Author

Thanks for the review @agramfort

# return regularization parameters
self.alpha_ = alpha_
self.lambda_ = lambda_
# and corresponding posterior mean and posterior covariance
Member


Here you recompute the posterior mean and posterior covariance, exactly as in the for-loop.
Could you refactor the two computations into a function, to avoid duplication?
And don't you need to update logdet_sigma_ also?

Contributor Author

@albertcthomas albertcthomas Nov 18, 2018


And don't you need to update logdet_sigma_ also?

Indeed, and rmse_ as well. Thanks!
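The refactoring discussed here could be sketched roughly as follows, with one helper recomputing everything that depends on (alpha, lambda) both inside the loop and once after the final hyperparameter update; `update_posterior` and `fit` are hypothetical names for illustration, not the merged code:

```python
import numpy as np

def update_posterior(X, y, alpha, lam):
    """Everything that depends on (alpha, lam) in one place, so the
    in-loop computation and the final one after the loop cannot drift."""
    n_features = X.shape[1]
    sigma = np.linalg.inv(lam * np.eye(n_features) + alpha * X.T @ X)
    coef = alpha * sigma @ X.T @ y
    rmse = np.sum((y - X @ coef) ** 2)
    logdet_sigma = np.linalg.slogdet(sigma)[1]
    return coef, sigma, rmse, logdet_sigma

def fit(X, y, alpha=1.0, lam=1.0, n_iter=3):
    for _ in range(n_iter):
        coef, sigma, rmse, logdet_sigma = update_posterior(X, y, alpha, lam)
        gamma = X.shape[1] - lam * np.trace(sigma)
        lam = gamma / (coef @ coef)
        alpha = (X.shape[0] - gamma) / rmse
    # one final call so *all* returned attributes (coef_, sigma_, rmse_,
    # logdet_sigma_) correspond to the final alpha_ and lambda_
    coef, sigma, rmse, logdet_sigma = update_posterior(X, y, alpha, lam)
    return coef, sigma, rmse, logdet_sigma, alpha, lam
```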

@TomDLT
Member

TomDLT commented Nov 15, 2018

You'll need a whats new entry.

@amueller amueller added this to the 0.21 milestone Nov 15, 2018
@albertcthomas
Contributor Author

Thanks for the review @TomDLT.

@albertcthomas
Contributor Author

albertcthomas commented Nov 19, 2018

@TomDLT you put the PR number instead of the issue number in whatsnew.

@jnothman
Member

you put the PR number instead of the issue number in whatsnew.

We usually do.

@albertcthomas
Contributor Author

Ah OK, it makes sense actually :). Thanks @TomDLT

@jnothman jnothman merged commit 0eda10a into scikit-learn:master Jan 10, 2019
@jnothman
Member

Thanks for the ping @auvipy!

@auvipy

auvipy commented Jan 10, 2019

wc!!

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019
koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019


7 participants