[MRG+1] Fix Bayesian ridge regression #12174

albertcthomas · 2018-09-26T17:15:58Z

Reference Issues/PRs

This PR takes over PR #11294 (a duplicate of PR #10751), which fixes the sign of log(det(Sigma)) in the computation of the marginal log likelihood in BayesianRidge.
Fixes issue #10748

What does this implement/fix? Explain your changes.

To implement a non-regression test for the score, I took some time to understand the code and found some issues with the returned attributes and the doc.

The references given in the doc and the docstring are 'Bayesian interpolation' (MacKay, 1992) and slides from a lecture which seems to follow 'Pattern Recognition and Machine Learning' by Bishop. However, the scikit-learn implementation of Bayesian ridge regression (using Gamma distributions for the hyperpriors) is just Automatic Relevance Determination with the same hyperparameter lambda for all coordinates of w. Hence, IMO, the best reference to understand the code (and the one that I found most helpful) is the one given for Automatic Relevance Determination, i.e., 'Sparse Bayesian Learning and the Relevance Vector Machine' by Tipping. So I would add this reference to the docstring and the doc of Bayesian ridge regression.
The docstring of the scores_ attribute reads: value of the objective function (to be maximized). I think we should be clearer about what the objective function is here, i.e., the marginal log likelihood, and maybe give the formula in the doc (equation (36) in the paper by Tipping) to be clear about what is done in the scikit-learn implementation.
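To make the sign question concrete, here is a minimal NumPy sketch, not the scikit-learn code itself. It assumes flat hyperpriors (the Gamma hyperprior terms of the full objective are omitted), no intercept, a noise precision `alpha` and a single weight precision `lam`. The function names are hypothetical. It computes the marginal log likelihood once directly from the marginal covariance of y and once from the posterior quantities, where log(det(Sigma)) enters with a plus sign.

```python
import numpy as np


def log_marginal_likelihood_direct(X, y, alpha, lam):
    """log p(y | alpha, lam) from the marginal y ~ N(0, C)."""
    n_samples = X.shape[0]
    # Marginal covariance: noise variance plus prior variance pushed through X
    C = np.eye(n_samples) / alpha + X @ X.T / lam
    _, logdet_C = np.linalg.slogdet(C)
    quad = y @ np.linalg.solve(C, y)
    return -0.5 * (n_samples * np.log(2 * np.pi) + logdet_C + quad)


def log_marginal_likelihood_posterior(X, y, alpha, lam):
    """Same quantity written with the posterior mean and covariance."""
    n_samples, n_features = X.shape
    sigma = np.linalg.inv(lam * np.eye(n_features) + alpha * X.T @ X)
    mu = alpha * sigma @ X.T @ y
    _, logdet_sigma = np.linalg.slogdet(sigma)
    rss = np.sum((y - X @ mu) ** 2)
    return 0.5 * (
        n_features * np.log(lam)
        + n_samples * np.log(alpha)
        - alpha * rss
        - lam * np.sum(mu ** 2)
        + logdet_sigma  # plus sign: log det of the posterior covariance
        - n_samples * np.log(2 * np.pi)
    )


# Illustrative data; any well-conditioned X and positive alpha, lam would do
rng = np.random.RandomState(0)
X = rng.randn(20, 3)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.randn(20)
print(log_marginal_likelihood_direct(X, y, 2.0, 0.5))
print(log_marginal_likelihood_posterior(X, y, 2.0, 0.5))
```

The two printed values should agree up to floating-point error; flipping the sign of `logdet_sigma` in the second function breaks that agreement.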

The returned coef_ attribute does not correspond to the last update of alpha_ and lambda_, whereas the sigma_ matrix is the one corresponding to the last update. I think this is a bug for coef_: when predicting, the coef_ attribute should be the one computed from the last updated values of alpha_ and lambda_ (see equations (19), (21) and (22) in Tipping).

Besides, it was decided in issue #8224 to return the alpha_ and lambda_ parameters corresponding to the (non-updated) coef_ attribute and not the last updates of alpha_ and lambda_. I think this is wrong: we should also return the last updates because they are the ones we should use in the posterior. If it is agreed that we should update coef_ as explained above, the last updates of alpha_ and lambda_ will correspond to the updated coef_ value.
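The consistency argument above can be sketched as follows. This is a simplified illustration, not the code of this PR: it assumes flat hyperpriors, no intercept, and MacKay-style updates; `posterior` and `fit_bayesian_ridge` are hypothetical names. The point is that the posterior mean and covariance are recomputed after the last hyperparameter update, so the returned coefficients, covariance, alpha and lambda all belong together.

```python
import numpy as np


def posterior(X, y, alpha, lam):
    """Posterior mean and covariance of w for fixed precisions alpha, lam."""
    n_features = X.shape[1]
    sigma = np.linalg.inv(lam * np.eye(n_features) + alpha * X.T @ X)
    coef = alpha * sigma @ X.T @ y
    return coef, sigma


def fit_bayesian_ridge(X, y, n_iter=300, tol=1e-3):
    n_samples = X.shape[0]
    eigvals = np.linalg.eigvalsh(X.T @ X)
    # Simple starting values (an assumption of this sketch)
    alpha, lam = 1.0 / np.var(y), 1.0
    coef, sigma = posterior(X, y, alpha, lam)
    for _ in range(n_iter):
        # MacKay-style updates computed from the current posterior mean
        gamma = np.sum(alpha * eigvals / (lam + alpha * eigvals))
        rss = np.sum((y - X @ coef) ** 2)
        lam = gamma / np.sum(coef ** 2)
        alpha = (n_samples - gamma) / rss
        # Recompute the posterior with the updated alpha and lam, so that
        # whatever is returned corresponds to the last update
        coef_old = coef
        coef, sigma = posterior(X, y, alpha, lam)
        if np.sum(np.abs(coef - coef_old)) < tol:
            break
    return coef, sigma, alpha, lam


rng = np.random.RandomState(0)
X = rng.randn(50, 3)
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.randn(50)
coef, sigma, alpha, lam = fit_bayesian_ridge(X, y)
print(coef, alpha, lam)
```

Factoring the posterior computation into one helper also avoids duplicating it inside and after the loop.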

The first test in test_bayes is skipped because it is broken and I think we should fix it or remove it but I am not certain about the correct output yet, maybe @agramfort can help for this one? It checks that the scores are increasing.
Fix sign of logdet Sigma in the score (previous PRs)
Improve readability of the code: the variable sigma is now named scaled_sigma, and some of the comments are updated
Non-regression tests
Update doc and refs
Remove skipped increasing scores test.
Fix coef_, lambda_ and alpha_ attributes.

Closes #11294 #10751 #10748

agramfort · 2018-09-26T19:18:22Z

woooottt I have not looked at this code for the last 8 years :) (git blame will confirm...)

if the skipped test now passes with your changes then you have a non-regression test...

thx @albertcthomas for looking into this

albertcthomas · 2018-09-27T15:10:11Z

The scikit-learn implementation uses the MacKay updates and, according to section 2.2 of 'A New View of Automatic Relevance Determination' (Wipf and Nagarajan, 2008), "MacKay updates do not even guarantee cost function decrease", so the skipped test should be removed.

agramfort · 2018-09-28T08:01:48Z

ok can you just insert comments in the code to say this? thx

albertcthomas · 2018-09-28T12:16:11Z

Ok, I removed the skipped test, added details to the docstring about the implementation and how the parameters are updated, and fixed the output values for coef_, alpha_ and lambda_. It should now be possible to add a non-regression test for the score computation (the main motivation of this PR :)). Will do that soon, as well as adding a few more details in the doc.

albertcthomas · 2018-10-01T13:54:11Z

This PR is ready for reviews. I added a non regression test: test_bayesian_ridge_score_values.py. The test passes with this PR. When run on master the test returns

AssertionError:
Arrays are not almost equal to 9 decimals
 ACTUAL: -2901.7989775709366
 DESIRED: -3410.6536570484113

agramfort

besides my nitpick LGTM

thx @albertcthomas

agramfort · 2018-10-07T16:06:56Z

sklearn/linear_model/bayes.py

    -----
-    For an example, see :ref:`examples/linear_model/plot_bayesian_ridge.py
-    <sphx_glr_auto_examples_linear_model_plot_bayesian_ridge.py>`.
+    There exist several strategies to perform Bayesian ridge regression. This


exist -> exists

I think 'exist' is correct here.

albertcthomas · 2018-10-08T07:33:16Z

Thanks for the review @agramfort

TomDLT · 2018-11-15T13:14:33Z

sklearn/linear_model/bayes.py

+        # return regularization parameters
+        self.alpha_ = alpha_
+        self.lambda_ = lambda_
+        # and corresponding posterior mean and posterior covariance


Here you recompute the posterior mean and posterior covariance, exactly as in the for-loop.
Could you refactor the two computations into a function, to avoid duplication?
And don't you need to update logdet_sigma_ also?

And don't you need to update logdet_sigma_ also?

Indeed and rmse_ as well. Thanks!

TomDLT · 2018-11-15T13:17:53Z

You'll need a whats new entry.

albertcthomas · 2018-11-18T08:51:07Z

Thanks for the review @TomDLT.

albertcthomas · 2018-11-19T10:33:44Z

@TomDLT you put the PR number instead of the issue number in whatsnew.

jnothman · 2018-11-19T10:47:19Z

you put the PR number instead of the issue number in whatsnew.

We usually do.

albertcthomas · 2018-11-19T14:34:10Z

Ah ok it makes sense actually :). thanks @TomDLT

jnothman · 2019-01-10T10:38:04Z

Thanks for the ping @auvipy!

auvipy · 2019-01-10T10:47:52Z

wc!!

bartz and others added 4 commits September 22, 2018 11:48

Fixes sign of logdet_sigma term in BayesianRidge.fit

c3938a0

improve reading of the code

94a9472

sc in score

40a5b10

comment about gamma value

08c882d

albertcthomas added 2 commits September 28, 2018 14:00

remove skipped test because not well founded + add explanations and refs in docstring

b15597c

fix alpha, lambda and coef output values

472d135

change score position

3d964ea

albertcthomas force-pushed the fix_sign_bayes_ridge branch from b84a15e to 3d964ea Compare September 29, 2018 13:15

albertcthomas added 4 commits October 1, 2018 13:59

refactor computation of score

b1c0244

add test for score and add a check for n_iter value

5d57296

clarify doc and add ref

6552e88

delete newlines

3b5fae4

albertcthomas changed the title ~~[WIP] Bayesian ridge regression~~ [MRG] Bayesian ridge regression Oct 1, 2018

albertcthomas changed the title ~~[MRG] Bayesian ridge regression~~ [MRG] Fix Bayesian ridge regression Oct 1, 2018

fix doc

09780ca

agramfort approved these changes Oct 7, 2018

agramfort changed the title ~~[MRG] Fix Bayesian ridge regression~~ [MRG+1] Fix Bayesian ridge regression Oct 7, 2018

albertcthomas mentioned this pull request Oct 8, 2018

Fixes sign of logdet_sigma term in BayesianRidge.fit #11294

Closed

TomDLT reviewed Nov 15, 2018

amueller added this to the 0.21 milestone Nov 15, 2018

refactor code and fix last score

8ea2eaf

whatsnew entry

f95ac9f

Change PR number in whats new

0daae57

TomDLT approved these changes Nov 19, 2018

auvipy approved these changes Jan 10, 2019

Merge branch 'master' into fix_sign_bayes_ridge

a3c3426

jnothman merged commit 0eda10a into scikit-learn:master Jan 10, 2019

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

FIX Bayesian ridge regression: returned values to match last update (scikit-learn#12174)

1bdbbbd

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

Revert "FIX Bayesian ridge regression: returned values to match last update (scikit-learn#12174)" This reverts commit 1bdbbbd.

ea578e9

xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019

Revert "FIX Bayesian ridge regression: returned values to match last update (scikit-learn#12174)" This reverts commit 1bdbbbd.

1ba1aed

koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019

FIX Bayesian ridge regression: returned values to match last update (scikit-learn#12174)

58dc762

This was referenced Aug 5, 2019

Changed the sign before 'logdet_sigma_' in computing score 's' #10751

Closed

Equation for scores_ in BayesianRidge seems incorrect. #10748

Closed
