[MRG] DOC Instructions for changing default value of a certain parameter #11469

Merged
merged 10 commits into from
Jul 16, 2018

Conversation

qinhanmin2014
Member

@qinhanmin2014 qinhanmin2014 commented Jul 10, 2018

Closes #11219 Closes #11283
We have several PRs (marked as blocker) which aim to change the default value of a certain parameter, but it seems that contributors don't know our way of doing these things.
For now, I have only written down (what I think is) our consensus. Whether we should take care of tests/examples when changing the default value of a parameter is still unclear.
See #11043 (comment)
cc @jnothman

@qinhanmin2014 qinhanmin2014 changed the title from "DOC Instructions for changing default value of a certain parameter" to "[MRG] DOC Instructions for changing default value of a certain parameter" Jul 10, 2018
@rth
Member

rth commented Jul 10, 2018

Thanks @qinhanmin2014 ! I was also not certain about the recommended way.

From #11043 (comment),

I think there's consensus that we should give the parameter a new default value and we're going to recommend deprecated instead of None

Would you remember where this was discussed? I looked into some of the linked issues: most appear to have used either None or something else instead of "deprecated", but I probably missed something :)

@qinhanmin2014
Member Author

@rth Here is the comment from @jnothman
#11283 (comment)
I'm aware that few (no?) existing PRs use 'deprecated', so it seems that I shouldn't regard it as a consensus yet :)

@rth
Member

rth commented Jul 10, 2018

Got it, thanks. What about FutureWarning vs DeprecationWarning, are there any references on it? Currently the latter seems to be used more than the former. Is it about DeprecationWarning being ignored by default?

From https://docs.python.org/3.7/library/warnings.html#warning-categories

DeprecationWarning | Base category for warnings about deprecated features when those warnings are intended for other Python developers (ignored by default, unless triggered by code in __main__).
FutureWarning | Base category for warnings about deprecated features when those warnings are intended for end users of applications that are written in Python.
PendingDeprecationWarning | Base category for warnings about features that will be deprecated in the future (ignored by default).

Changed in version 3.7: Previously DeprecationWarning and FutureWarning were distinguished based on whether a feature was being removed entirely or changing its behaviour. They are now distinguished based on their intended audience and the way they’re handled by the default warnings filters.

Hmm, this does suggest using FutureWarning for parameter renaming, if I read this correctly?
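
To make the quoted filter behaviour concrete, here is a minimal sketch. It mimics (rather than reads) the two relevant "ignore" entries of Python's default filter list and records which categories get through. The function name `recorded_categories` is made up for illustration, and the special case for code running in `__main__` is deliberately left out.

```python
import warnings


def recorded_categories():
    """Return the warning categories that survive a default-like filter list."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()
        # Record everything that is not explicitly ignored below.
        warnings.simplefilter("always")
        # Assumption: these two entries stand in for the default filters
        # that silence DeprecationWarning and PendingDeprecationWarning.
        warnings.simplefilter("ignore", DeprecationWarning)
        warnings.simplefilter("ignore", PendingDeprecationWarning)
        for category in (DeprecationWarning, PendingDeprecationWarning,
                         FutureWarning):
            warnings.warn("example message", category)
    return [w.category for w in caught]


survivors = recorded_categories()
print(survivors)
```

Under these assumed filters only FutureWarning is recorded, which is the practical reason it reaches people who never enable extra warning flags.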

@rth
Member

rth commented Jul 10, 2018

Also from the above description, I'm not sure if scikit-learn users belong to "other Python developers" or "end users of applications that are written in Python" category.

@qinhanmin2014
Member Author

(1) I have to admit that I used to distinguish these two warnings by their names (which happens to be similar to what was recommended before Python 3.7), so thanks for the material.
(2) Some related discussions are split across the relevant PRs, e.g.,
#10331 (comment)
Seems that FutureWarning is accepted more often? (heading to bed, sorry for the lack of details)
(3) The new definition of these warnings seems really hard to understand from my side. What's the difference between end users and python developers? Are scikit-learn users end users or python developers?

@rth
Member

rth commented Jul 10, 2018

(also ping @lesteve who commented about this in linked issues)

@lesteve
Member

lesteve commented Jul 10, 2018

I have to say I am not sure there is a generic way of doing this so this is a bit tricky to document.

For example IMO:

  • FutureWarning makes sense if we want to change the default value while the old default value stays valid
  • DeprecationWarning makes sense if we want to change the default value and get rid of the old default value at the same time.

I even seem to remember that there have been PRs where we combined both (a first step with a FutureWarning to warn that the default value is going to change, plus a promise to add, once the default value has changed, a DeprecationWarning on the old default value when it is set explicitly). IMO this one is a bit too conservative.

Here is what I would propose:

  • maybe document both options. If I had to choose only one, I would pick the one with FutureWarning, which I find more widely applicable. In the FutureWarning case, I reckon 'deprecated' is not a good name.
  • maybe add a few words to explain why we cannot just change the default value without warning

Maybe it would be good to list a few recent PRs (sorry if this has been done already somewhere) that did change the default value, so we can take a closer look at what our general approach is "in the wild".

@qinhanmin2014
Member Author

Thanks all for the quick replies. It's hard to figure out a good way, but I think we should settle on an acceptable way and provide it to contributors ASAP.
(1) Regarding what to use as the new default: @jnothman, @amueller and I vote +1 for 'deprecated', @rth votes +0 (right?), @lesteve votes -1. #9379 uses 'warn', #10331 uses 'deprecated', #11043 uses None, so I still use 'deprecated'. I think it's reasonable to provide users with multiple choices here.
(2) Regarding FutureWarning or DeprecationWarning: I doubt whether it's good to provide users with multiple choices here. I vote for FutureWarning (@lesteve voted for it in #10331 and @TomDLT voted for it in #9997, unless you've changed your minds). #10331 and #11043 use FutureWarning (#9379 uses DeprecationWarning because the parameter will eventually be deprecated). According to the definition before Python 3.7, FutureWarning seems to be the right choice. According to the definition from Python 3.7 on, FutureWarning still seems to be the right choice (DeprecationWarning will be ignored by default)?
I have added some sample PRs and the reason for the deprecation cycle.
Marking it as 0.20 and blocker.

@@ -796,11 +796,28 @@ In the following example, k is deprecated and renamed to n_clusters::
                      "will be removed in 0.15.", DeprecationWarning)
        n_clusters = k

If the default value of a parameter needs to be changed, it is recommended to
Member

I would use something other than "recommended" if we actually impose it :)

@glemaitre
Member

maybe document both options. If I had to choose only one, I would pick the one with FutureWarning, which I find more widely applicable. In the FutureWarning case, I reckon 'deprecated' is not a good name.

I would go with this approach but I am fine with anything chosen. I don't think this is actually a blocker until we agree on how to do it.

maybe add a few words to explain why we cannot just change the default value without warning

The PR LGTM apart from the above comment, which I think could be addressed in the narrative as well.

@amueller
Member

I think we have used deprecation warnings for changing defaults in the past but maybe FutureWarning is better in this case.
Maybe 'warn' is good, or alternatively if a string is allowed use a sentinel.

raise ``FutureWarning`` when users are using the default value. In the
following example, we change the default value of ``n_clusters`` from 5 to 10
(current version is 0.20). You can also refer to recent merged PRs
(e.g., `#10331 <https://github.com/scikit-learn/scikit-learn/pull/10331>`__
Member

they will not be recent for long. maybe remove these and instead say everything important here?

        warnings.warn("The default value of n_clusters will change from "
                      "5 to 10 in 0.22.", FutureWarning)
        n_clusters = 5
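
Completed into a runnable function, the fragment above might look like the following sketch. The message and the values 5 and 10 come from the diff; the `'warn'` sentinel as the new default (one of the options discussed in this thread) and the return value are assumptions for illustration.

```python
import warnings


def example_function(n_clusters="warn"):
    # "warn" is a sentinel meaning "the caller did not choose a value".
    if n_clusters == "warn":
        warnings.warn("The default value of n_clusters will change from "
                      "5 to 10 in 0.22.", FutureWarning)
        n_clusters = 5  # keep the old behaviour during the transition
    return n_clusters  # hypothetical return value, for illustration only


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_result = example_function()                # warns, old default
    explicit_result = example_function(n_clusters=10)  # silent
```

Passing any explicit value, including the old default 5, keeps the call silent, so users can opt out of the warning without changing behaviour.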

Member

I think there should also be an example for an __init__ parameter in a class, to show that the deprecation needs to happen in ``fit``.

@qinhanmin2014
Member Author

Thanks all for the instant reply. I agree that it's hard to figure out a perfect solution, but I think we should provide an acceptable solution to users ASAP.
(1) After going through your comments, I think FutureWarning & warn as the new default might be an acceptable solution. I still doubt whether it's appropriate to allow users to use either FutureWarning or DeprecationWarning. If someone insists, I'll update accordingly.
(2) @amueller Regarding the example, the examples in this section are all based on functions, not classes. I can't figure out a way to provide a simple example based on classes. Also, we've provided some sample PRs about changing default value of a parameter in classes. Need more detailed instructions here :)

@qinhanmin2014
Member Author

Thanks @glemaitre for the great improvements :) These changes LGTM from my side.
ping @jnothman @amueller @rth @lesteve for a review if you have time :)

Member

@TomDLT TomDLT left a comment

LGTM

-    def example_function(n_clusters=8, k=None):
-        if k is not None:
+    def example_function(n_clusters=8, k='deprecated'):
+        if k != 'deprecated':
Member

Speaking with @lesteve, it is true that this line looks really weird in English:
"if k is not deprecated then warn for deprecation"

Member Author

@glemaitre @lesteve
I think 'deprecated', along with 'warn', can be regarded as symbols here. k != 'deprecated' can be read as: if you use the deprecated parameter, then we'll raise a DeprecationWarning.
Revert back to None if you don't like it?

Member

I think warn makes sense for changed behaviors, because then it's

if value == "warn":
   # do the warning

as below. we could have "not specified" for removing a parameter as a symbol.

if value != "not specified":
   # do the warning

That's less clear in the signature, though.
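
The second pattern (a sentinel for a parameter that is being removed or renamed) can be sketched as follows. The `'deprecated'` sentinel and the warning message are taken from the diff under review; the exact sentinel string was still being debated at this point, and the return value is a hypothetical addition so the effect is visible.

```python
import warnings


def example_function(n_clusters=8, k="deprecated"):
    # Any value other than the sentinel means the caller passed k explicitly.
    if k != "deprecated":
        warnings.warn("'k' was renamed to n_clusters in version 0.13 and "
                      "will be removed in 0.15.", DeprecationWarning)
        n_clusters = k
    return n_clusters  # hypothetical return value, for illustration only


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    new_name_result = example_function(n_clusters=3)  # no warning expected
    old_name_result = example_function(k=4)           # warning expected
```

Compared with a `None` default, a string sentinel also flags an explicit `k=None`, at the cost of a less obvious signature.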

Member

maybe "not used" would be better?

Member

@amueller amueller left a comment

These are great instructions!

Two things are missing, though:
A deprecation requires a test that the warning is raised in the relevant cases and not in the other cases, AND the deprecation warning should be caught in all other tests, and there should be no warning in the examples. Maybe that requires a note box or some other emphasis?
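
A minimal sketch of such a test, using only the standard library. `example_function`, `warning_categories` and the two test functions are hypothetical names; a real test suite would more likely rely on the project's own helpers or on pytest.

```python
import warnings


def example_function(n_clusters="warn"):
    # Hypothetical function under test: warns only when the default is used.
    if n_clusters == "warn":
        warnings.warn("The default value of n_clusters will change from "
                      "5 to 10 in 0.22.", FutureWarning)
        n_clusters = 5
    return n_clusters


def warning_categories(func, *args, **kwargs):
    """Call func and return the categories of all warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func(*args, **kwargs)
    return [w.category for w in caught]


def test_default_warns():
    # Relevant case: relying on the default must raise exactly one warning.
    assert warning_categories(example_function) == [FutureWarning]


def test_explicit_value_is_silent():
    # Other cases: any explicit value must not raise a warning.
    assert warning_categories(example_function, n_clusters=5) == []
    assert warning_categories(example_function, n_clusters=10) == []
```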

        warnings.warn("'k' was renamed to n_clusters in version 0.13 and "
                      "will be removed in 0.15.", DeprecationWarning)
        n_clusters = k

When the change is in a class, we conduct validate and raise warning in ``fit``::
Member

either "we validate" or "we conduct validation", though I'd prefer the first.


import warnings

class ExampleEstimator:
Member

inherit from BaseEstimator maybe? I mean not necessary but why not?
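
For reference, a self-contained sketch of the class-based pattern discussed here: `__init__` only stores the parameter, and the check plus warning happen in `fit`. The `'warn'` sentinel, the private `_n_clusters` attribute and the toy input are assumptions for illustration, and `BaseEstimator` is left out only to keep the sketch free of dependencies.

```python
import warnings


class ExampleEstimator:
    def __init__(self, n_clusters="warn"):
        # __init__ only stores the parameter; no validation or warning here.
        self.n_clusters = n_clusters

    def fit(self, X, y=None):
        if self.n_clusters == "warn":
            warnings.warn("The default value of n_clusters will change from "
                          "5 to 10 in 0.22.", FutureWarning)
            self._n_clusters = 5  # resolved value, kept separate from the input
        else:
            self._n_clusters = self.n_clusters
        return self


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    default_est = ExampleEstimator()   # constructing does not warn
    warnings_after_init = len(caught)
    default_est.fit([[0.0], [1.0]])    # fit does
    explicit_est = ExampleEstimator(n_clusters=10).fit([[0.0], [1.0]])
```

Storing the resolved value under a different name leaves the constructor argument untouched, so the public `n_clusters` attribute still reflects exactly what the user passed.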

-    def example_function(n_clusters=8, k=None):
-        if k is not None:
+    def example_function(n_clusters=8, k='deprecated'):
+        if k != 'deprecated':
Member

maybe "not used" would be better?

``FutureWarning`` when users are using the default value. In the following
example, we change the default value of ``n_clusters`` from 5 to 10
(current version is 0.20). You can also refer to merged PRs
(e.g., `#10331 <https://github.com/scikit-learn/scikit-learn/pull/10331>`__
Member

do these add anything? I mean I'm not super opposed but I'm not sure what's in the PR that's not here.

Member

I would remove the PR references. I imagine the discussion is a bit chaotic compared to the guidelines we are trying to provide.

@amueller
Member

this makes me so happy :)

@qinhanmin2014
Member Author

@amueller Thanks :) comments addressed.

do these add anything? I mean I'm not super opposed but I'm not sure what's in the PR that's not here.

Originally, I intended to show users how to do the deprecation in a function through an example, and how to do it in a class through merged PRs. Now that we have an example for a class, they don't add anything. But I think it's still useful to tell users to refer to merged PRs.

A deprecation requires a test that the warning is raised in the relevant cases and not in the other cases, AND the deprecation warning should be caught in all other tests, and there should be no warning in the examples. Maybe that requires a note box or some other emphasis?

I agree and have added it. But I suggest that you check my comment #11043 (comment). Currently, we fail to do so.

@amueller
Member

Yes, I know, we're in a very bad state right now wrt deprecation warnings :-/ One more reason to emphasize that.

@@ -790,17 +790,66 @@ In the following example, k is deprecated and renamed to n_clusters::

import warnings

-    def example_function(n_clusters=8, k=None):
-        if k is not None:
+    def example_function(n_clusters=8, k='not_used'):
Member

We should wait and see whether others think this is a good idea. @jnothman @ogrisel @rth ?

@glemaitre
Member

LGTM

@amueller
Member

sorry for nagging again, but maybe it would be even better to say how to catch the deprecation warnings. Because I'm not actually sure what's the right way to do that and to test that the right warnings are raised ;)

@glemaitre
Member

sorry for nagging again, but maybe it would be even better to say how to catch the deprecation warnings. Because I'm not actually sure what's the right way to do that and to test that the right warnings are raised ;)

I see that it could be useful, but more for users than for contributors, wouldn't it? In this section we try to explain how to raise the warning, while catching it should be covered in another section accessible to all users, IMO.

@amueller
Member

As discussed in person: no, if you make a deprecation you should make sure all the warnings you expect are caught and there are no warnings left.
See the example of #11537 for why I think this is important (this was the second deprecation warning I was trying to fix from the current test, only like 200 more to go).

@glemaitre
Member

Basically, it seems that pytest.mark.filterwarnings("ignore:something deprecated") could be used. It avoids printing the warning but does not prevent it from being raised.

@glemaitre
Member

It avoids printing the warning but does not prevent it from being raised.

Actually it does not matter. When we want to assert warnings/no warnings, we actually do not want to filter them.
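
As a rough standard-library analogue of what an "ignore" filter does (pytest's `@pytest.mark.filterwarnings("ignore:...")` marker takes the same style of filter specification), the sketch below silences one expected message while still letting unrelated warnings through. `legacy_call` and `run_with_ignore` are hypothetical names used only for this illustration.

```python
import warnings


def legacy_call():
    # Hypothetical code path that still relies on the old default.
    warnings.warn("The default value of n_clusters will change from "
                  "5 to 10 in 0.22.", FutureWarning)
    return 5


def run_with_ignore():
    """Run legacy_call with its expected warning filtered out."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Ignore only the known message; the pattern is matched as a regular
        # expression against the start of the warning text.
        warnings.filterwarnings("ignore",
                                message="The default value of n_clusters",
                                category=FutureWarning)
        result = legacy_call()
        warnings.warn("unrelated warning", UserWarning)
    return result, [str(w.message) for w in caught]


result, remaining = run_with_ignore()
```

Keeping the filter this narrow means a new, unexpected warning in the same test would still show up instead of being swallowed.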

@lesteve
Member

lesteve commented Jul 15, 2018

Basically, it seems that pytest.mark.filterwarnings("ignore:something deprecated") could be used. It avoids printing the warning but does not prevent it from being raised.

I think at the moment there are two ways to ignore warnings: sklearn.utils.testing.ignore_warnings (our own) and pytest.mark.filterwarnings (pytest). Personally I would be in favour of recommending pytest.mark.filterwarnings if we can.

@lesteve
Member

lesteve commented Jul 15, 2018

Personally I would be in favour of recommending pytest.mark.filterwarnings if we can.

The caveat is that sometimes you cannot use pytest.mark.filterwarnings, e.g. in sklearn/utils/estimator_checks.py

@qinhanmin2014
Member Author

DOC add subsection

Thanks. Since we have two subsections now, I also summarize the TODOs in the new section.

but maybe it would be even better to say how to catch the deprecation warnings.

I'm going to follow @lesteve & @glemaitre and recommend pytest.mark.filterwarnings. I think it can be used in most scenarios.

Contributor

@albertcthomas albertcthomas left a comment

Small typo otherwise LGTM. Thanks @qinhanmin2014! This will be very useful.


Similar to deprecations, the warning message should always give both the
version in which the change happened and the version in which the old behavior
will be removed. The docstring needs to updated accordingly. We need a test
Contributor

to be updated

Member Author

Thanks @albertcthomas :) Corrected.

@glemaitre glemaitre merged commit 2716c8e into scikit-learn:master Jul 16, 2018
@glemaitre
Member

Merging, since 4 reviewers have approved this PR.

will be removed. The docstring needs to be updated accordingly. We need a test
which ensures that the warning is raised in relevant cases but not in other
cases. The warning should be caught in all other tests
(using e.g., ``pytest.mark.filter_warnings``), and there should be no warning
Member Author

@glemaitre Should we use @pytest.mark.filterwarnings here?

Contributor

+1

Member Author

@aboucaud I've pushed to master, please refer to the latest doc :)

Contributor

yes but you forgot to remove the _ :)

Member Author

Oops, thanks a lot @aboucaud :)

@glemaitre
Member

glemaitre commented Jul 16, 2018 via email

@jnothman
Member

jnothman commented Jul 17, 2018 via email

@qinhanmin2014 qinhanmin2014 deleted the change-default-guide branch July 17, 2018 02:37

9 participants