Conversation

@mabelvj
Contributor

@mabelvj mabelvj commented Jan 3, 2018

fixes #10393

  • [Fix bug]: added support for float alphas in class _RidgeGCV(LinearModel), lines 1050 and 1052.
  • [Test]: added a test to check that using negative alphas does not raise an error.

@mabelvj mabelvj changed the title [WIP] Fixes #10393 Fixed error when fitting RidgeCV with negative alpha [WIP] Fixes #10393 Fixed error when fitting RidgeCV with negative alphas Jan 3, 2018
@amueller
Member

amueller commented Jan 3, 2018

negative alphas should raise an error, right?

@mabelvj mabelvj changed the title [WIP] Fixes #10393 Fixed error when fitting RidgeCV with negative alphas [WIP] Fixes #10393 Fixed error when fitting RidgeCV with integers Jan 3, 2018
@mabelvj
Contributor Author

mabelvj commented Jan 3, 2018

Yes! Sorry, I got confused and thought it should not raise an error. It's fixed now, testing both negative and positive alphas, as integers and as floats.

Member

@jnothman jnothman left a comment

Please use Fixes #10393 in the PR description, rather than something ad hoc like #Fixes issue #10393 so that GitHub knows to close the issue automatically when this is merged.

A first glance:

error = scorer is None

for i, alpha in enumerate(self.alphas):
if float(alpha) < 0:
Member

I don't think we need the float cast here...

ridge = RidgeCV(alphas)
assert_raises(ValueError, ridge.fit, X, y)

# Positive alphas
Member

Do we need to test positive float alphas? I think we already do so in all the tests, don't we?

ridge = RidgeCV(alphas)
ridge.fit(X, y)

# Negative integers
Member

I would separate the tests for negative alphas, since they should raise an error.
You can make a test called test_ridgecv_neg_alphas() and parametrize it with pytest over the integer and floating types.

decimal=6)


def test_ridgecv_alphas():
Member

rename test_ridgecv_int_alphas



def test_ridgecv_alphas():
# Test that no error is raised when fitting RidgeCV
Member

I would remove this comment, since it is obvious from the renamed function.


# Integers
alphas = (1, 10, 100)
ridge = RidgeCV(alphas)
Member

Pass a list directly when instantiating: RidgeCV(alphas=[1, 10, 100]).
You could also make sure that a numpy array of integers is converted as well. In that case, use a parametrized test:

@pytest.mark.parametrize(
    "alphas",
    [np.array([1, 10, 100]),
     [1, 10, 100]])
def test_ridge_cv_alphas(alphas):
    X = ...
    y = ...
    ridge = RidgeCV(alphas)
    ridge.fit(X, y)

@glemaitre
Member

Also I would make the conversion directly from __init__:

https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/ridge.py#L886

alphas = np.asarray(alphas, dtype=np.float64)

error = scorer is None

for i, alpha in enumerate(self.alphas):
if alpha < 0:
Member

The check needs to be done outside the loop. Otherwise we start computing things only to raise the error at the end.

So something like:

if np.any(alphas < 0):
    raise ValueError("alphas cannot be negative. Got {} containing some negative value instead.".format(alphas))
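
A minimal sketch of the suggestion above, assuming a hypothetical helper named `validate_alphas` (it is not part of scikit-learn): the alphas are converted to a float array once and checked before any per-alpha work starts.

```python
import numpy as np


def validate_alphas(alphas):
    """Hypothetical helper: convert alphas to floats and reject negatives."""
    alphas = np.asarray(alphas, dtype=np.float64)
    # Check once, up front, so no work is wasted before the error is raised.
    if np.any(alphas < 0):
        raise ValueError(
            "alphas cannot be negative. "
            "Got {} containing some negative value instead.".format(alphas))
    return alphas


checked = validate_alphas([1, 10, 100])
for alpha in checked:
    pass  # the per-alpha computation would go here
```

Because the check runs before the loop, a bad value anywhere in the sequence fails immediately rather than after earlier alphas have already been processed.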

# Negative integers
alphas = (-1, -10, -100)
ridge = RidgeCV(alphas)
assert_raises(ValueError, ridge.fit, X, y)
Member

you need to use assert_raises_regex to match the string
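
For illustration, the difference between a type-only check and a message check can be sketched with the standard library's `unittest.TestCase.assertRaisesRegex`, used here as a stand-in for `assert_raises_regex`. The function `check_alphas` is a hypothetical substitute for the validation done in `fit`, not scikit-learn code.

```python
import unittest


def check_alphas(alphas):
    """Hypothetical stand-in for the validation performed during fit."""
    if any(alpha < 0 for alpha in alphas):
        raise ValueError("alphas cannot be negative.")
    return [float(alpha) for alpha in alphas]


case = unittest.TestCase()

# Passes only if a ValueError is raised AND its message matches the pattern.
with case.assertRaisesRegex(ValueError, "alphas cannot be negative"):
    check_alphas((-1, -10, -100))

# A wrong message makes the assertion fail, unlike a type-only check.
try:
    with case.assertRaisesRegex(ValueError, "some other message"):
        check_alphas((-1, -10, -100))
    message_matched = True
except AssertionError:
    message_matched = False
```

Matching the message guards against the test passing on an unrelated `ValueError` raised elsewhere in the same call.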

@mabelvj mabelvj force-pushed the FIXES_issue_10393_integers_in_RidgeCV_alpha branch from d93e99c to 2f71665 Compare January 17, 2018 15:55
@mabelvj
Contributor Author

mabelvj commented Jan 30, 2018

Hi! Is it ok now?

@jnothman
Member

If you think the work is complete, please change WIP in the title to MRG

]

@pytest.mark.parametrize("alpha_input, alpha_expected", testdata_alpha)
def test_conversion(alpha_input, alpha_expected):
Member

this is never executed.

decimal=6)


def test_ridgecv_alpha_conversion_to_array():
Member

I think you can remove this line and dedent the rest of this function

Contributor Author

Great, I'm new to tests in python. Already fixed it.

@mabelvj mabelvj changed the title [WIP] Fixes #10393 Fixed error when fitting RidgeCV with integers [MRG] Fixes #10393 Fixed error when fitting RidgeCV with integers Jan 30, 2018
Member

@jnothman jnothman left a comment

Thanks


@pytest.mark.parametrize("alpha_input, alpha_expected", testdata_alpha)
def test_conversion(alpha_input, alpha_expected):
assert((RidgeCV(alpha_input).get_params()['alphas'] ==
Member

I'm not actually sure what this is trying to test. Is it trying to test that the input is validated and turned into floats before fit? We don't usually do this, because the user may also set them with set_params.

I also don't think this is currently asserting that the alphas are floats, only that they are unchanged or equivalent.

And I think we have common tests which do that. In short, I don't think this test adds anything in its current form.

Contributor Author

OK, I'll remove it. I only added it at the other reviewer's suggestion. At least I've learned how these tests work.

normalize=False, random_state=None, solver='auto', tol=0.001)
"""

Member

Please avoid introducing unnecessary and unrelated changes like this. It makes it hard to review your work, and may introduce merge conflicts for other changes in the works.

cv=None, gcv_mode=None,
store_cv_values=False):
self.alphas = alphas
self.alphas = np.asarray(alphas, dtype=np.float64)
Member

Usually we do not alter parameters in __init__, because they can also be set in other ways. We delay all validation until fit (except in old code)
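
To illustrate why validation is deferred, here is a toy estimator (hypothetical, not scikit-learn's `RidgeCV`) with a simplified `set_params`. A conversion placed in `__init__` would be skipped whenever the parameter is set later, whereas validation in `fit` always runs.

```python
class ToyRidgeCV:
    """Toy estimator for illustration only; not scikit-learn code."""

    def __init__(self, alphas=(0.1, 1.0, 10.0)):
        # Store the parameter untouched, following the convention described.
        self.alphas = alphas

    def set_params(self, **params):
        # Simplified stand-in for BaseEstimator.set_params: bypasses __init__.
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self):
        # Validation happens here, so it covers every way alphas can be set.
        alphas = [float(alpha) for alpha in self.alphas]
        if any(alpha < 0 for alpha in alphas):
            raise ValueError("alphas cannot be negative.")
        self.alphas_checked_ = alphas
        return self


model = ToyRidgeCV().set_params(alphas=[1, 10, 100])
model.fit()
```

The user-supplied `alphas` attribute stays exactly as given, and the validated copy lives in a separate fitted attribute (named `alphas_checked_` here purely for the example).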

Contributor Author

My mistake, an error in the file. Changed again.

@mabelvj mabelvj force-pushed the FIXES_issue_10393_integers_in_RidgeCV_alpha branch from b6dc752 to 4444390 Compare January 30, 2018 21:26
alpha_expected).all())


def test_ridgecv_int_alphas():
Member

why did you remove this?

cv=None, gcv_mode=None,
store_cv_values=False):
self.alphas = np.asarray(alphas, dtype=np.float64)
self.alphas = np.asarray(alphas)
Member

Oh, I see that this was already done in master. These days we would avoid such validation.

Member

@jnothman jnothman left a comment

flake8 error.

Otherwise LGTM

@jnothman
Member

Please add an entry to the change log under Bug Fixes at doc/whats_new/v0.20.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:

@jnothman jnothman changed the title [MRG] Fixes #10393 Fixed error when fitting RidgeCV with integers [MRG+1] Fixes #10393 Fixed error when fitting RidgeCV with integers Feb 8, 2018
@mabelvj mabelvj force-pushed the FIXES_issue_10393_integers_in_RidgeCV_alpha branch 2 times, most recently from f1b47c0 to 4f9d213 Compare February 8, 2018 18:32

- :class:`decomposition.IncrementalPCA` in Python 2 (bug fix)
- :class:`isotonic.IsotonicRegression` (bug fix)
- :class:`linear_model.ARDRegression` (bug fix)
Member

You seem to have removed some text from v0.20.rst, probably without realising. Please look at your diff and re-add the text you removed.

DENSE_FILTER = lambda X: X
SPARSE_FILTER = lambda X: sp.csr_matrix(X)

def DENSE_FILTER(X): return X
Member

Please avoid changes that are not related to your PR. It makes the review less pleasant for everyone involved. Can you put back the lambdas?

Contributor Author

I'm sorry, I don't know why this keeps happening. I had already reverted those changes.

@mabelvj mabelvj force-pushed the FIXES_issue_10393_integers_in_RidgeCV_alpha branch from 0049219 to 9b4a319 Compare February 15, 2018 23:44
@mabelvj
Contributor Author

mabelvj commented Feb 15, 2018

I'm sorry for all the mess. I'm new to open source and did not know how to deal with remote changes and do the pull --rebase; that's why some of the parts got removed. I updated the documentation, adding my line, and then reverted the issue with the lambdas again.

@mabelvj mabelvj force-pushed the FIXES_issue_10393_integers_in_RidgeCV_alpha branch from 9b4a319 to e3a6d72 Compare February 16, 2018 00:11
Member

@qinhanmin2014 qinhanmin2014 left a comment

LGTM overall. @mabelvj, please try to avoid unrelated changes (there are still some extra blank lines). Also, please try to fill the current line before starting a new line.

overridden when using parameter ``copy_X=True`` and ``check_input=False``.
:issue:`10581` by :user:`Yacine Mazari <ymazari>`.

- Fixed a bug in :class:`linear_model.RidgeCV` where using negative integer
Member

What do you mean by this? Should "negative integer" be "integer", since the bug is mainly about an unexpected error when using integer alphas? (Negative integers will be rejected, right?)

Contributor Author

@mabelvj mabelvj Mar 8, 2018

You're right, both integers raise error.


- Add test :func:`estimator_checks.check_methods_subset_invariance` to check
that estimators methods are invariant if applied to a data subset.
:issue:`10420` by :user:`Jonathan Ohayon <Johayon>`
Member

Try to get rid of this strange diff.

Contributor Author

I don't know; the file in master already has that line. Should I remove it?

Member

If you actually don't change anything and find it hard to get rid of it, you might just keep it. (Hope there won't be some strange things when merging)

cv=None, gcv_mode=None,
store_cv_values=False):
self.alphas = alphas
self.alphas = np.asarray(alphas)
Member

Why do this?

Contributor Author

It was suggested a few comments above that I make that change: add a conversion in __init__ and then remove the float cast. It's done in the __init__ of _RidgeGCV.

"alphas cannot be negative.",
ridge.fit, X, y)

# Negative alphas
Member

I think this test is redundant. We don't need so many tests for such a minor issue.

Contributor Author

@mabelvj mabelvj Mar 8, 2018

I added that test because the initial error stated: ValueError: Integers to negative integer powers are not allowed. So I had to add a line to raise an error for negative alphas and in the tests I was testing it worked.
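
The original error can be reproduced with NumPy alone. The snippet below is a sketch of the underlying behaviour, not the actual `_RidgeGCV` code: raising an integer array to a negative integer power is rejected, while the same operation on floats succeeds. Note that the alphas here are positive; the "negative" in the message refers to the exponent.

```python
import numpy as np

int_alphas = np.array([1, 10, 100])

try:
    int_alphas ** -1
    int_error = None
except ValueError as exc:
    # NumPy refuses integer arrays raised to negative integer powers.
    int_error = str(exc)

# Converting to float first makes the same expression well defined.
float_alphas = np.asarray(int_alphas, dtype=np.float64)
inverse = float_alphas ** -1
```

This is why the fix concerns integer alphas in general, with the rejection of negative alphas being a separate validation step.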

Member

I'm not persuaded, but I won't focus too much on that.
I just think such a minor thing doesn't deserve so many tests.

@qinhanmin2014
Member

@mabelvj Thanks for the explanation. I don't think I'll focus too much on these minor things, so please:
(1) resolve the conflict
(2) avoid all unrelated changes (please double check your diff here)
(3) try to fill the current line before starting a new line, and remove unnecessary blank lines
I think it's very close to being merged.

@mabelvj mabelvj force-pushed the FIXES_issue_10393_integers_in_RidgeCV_alpha branch 2 times, most recently from ec918fa to e3a6d72 Compare March 8, 2018 13:58
@mabelvj mabelvj force-pushed the FIXES_issue_10393_integers_in_RidgeCV_alpha branch from e3a6d72 to 7ce9ba4 Compare March 8, 2018 14:34
Member

@qinhanmin2014 qinhanmin2014 left a comment

LGTM. I've pushed some minor change about the format.

@qinhanmin2014 qinhanmin2014 merged commit 03dd287 into scikit-learn:master Mar 8, 2018
@qinhanmin2014
Member

Thanks @mabelvj :)

Development

Successfully merging this pull request may close these issues.

integers in RidgeCV alpha

6 participants