ENH allows checks generator to be pluggable #18750


Open
wants to merge 5 commits into main

Conversation

@glemaitre (Member)

This PR allows passing a third-party generator that yields custom checks.

In imbalanced-learn, we reimplement the infrastructure developed in scikit-learn just to override _yield_all_checks with our own generator. It would be friendlier to allow plugging in any generator.

However, we should make clear that the signature of the check functions can change at any time.
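
For illustration, a minimal sketch of what a pluggable generator could look like with the proposed checks_generator parameter. The parameter only exists in this PR; check_fit_returns_self and my_yield_all_checks are hypothetical names, and _yield_all_checks is a private helper whose signature may change between versions.

import numpy as np

from sklearn.linear_model import LogisticRegression
from sklearn.utils.estimator_checks import _yield_all_checks, parametrize_with_checks


def check_fit_returns_self(name, estimator):
    # Hypothetical third-party check: fit must return the estimator itself.
    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y = np.array([0, 0, 1, 1])
    assert estimator.fit(X, y) is estimator, f"{name}.fit did not return self"


def my_yield_all_checks(estimator):
    # Reuse the scikit-learn checks, then append the library's own one.
    yield from _yield_all_checks(estimator)
    yield check_fit_returns_self


# The checks_generator keyword is the one proposed in this PR.
@parametrize_with_checks([LogisticRegression()], checks_generator=my_yield_all_checks)
def test_estimators(estimator, check):
    check(estimator)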

@glemaitre glemaitre changed the title MNT allows checks generator to be pluggable ENH allows checks generator to be pluggable Nov 3, 2020
@glemaitre (Member Author)

ping @rth @ogrisel @NicolasHug @thomasjpfan

@NicolasHug (Member) left a comment

For now I'm a bit concerned about the tradeoff between complexity and usefulness.

This will be mostly useful to libraries that implement lots of scikit-learn compatible estimators and that want to reuse our check infrastructure (which is quite unstable). I don't have the entire landscape in mind, but I'm not sure this will be used by any other lib except imbalanced-learn?

purpose. This parameter is a generator that yields callables such as::


def check_estimator_has_fit(name, instance, strict_mode=True):

Member

this will be api_only

@glemaitre (Member Author)

Yep but we need to merge your PR first.

@glemaitre (Member Author)

I would say that it could be extended to scikit-learn-contrib packages, but I agree this is limited.

@rth (Member) commented Nov 3, 2020

So if I understand correctly, the use case is that you have some additional estimator checks that you would like to run in imbalanced-learn and you don't want to re-implement parametrize_with_checks (i.e. handling of xfail, skip via estimator tags, etc.)?

Having more flexibility in check generation wouldn't hurt, I think. For instance, currently in tslearn the common checks are monkeypatched (#14057 (comment)), mostly to address the difference in input types, but I guess this could also be useful, maybe @rtavenar? There are common checks that would make sense for time-series input that we don't include in scikit-learn.

Generally, even if it only helps @glemaitre and @chkoar maintain imbalanced-learn and doesn't cost us much, and we are clear that this is experimental (I wouldn't say it adds that much complexity), I would still be +1 for it. I think facilitating the maintenance of projects in scikit-learn-contrib is something that we want to do, when possible.

@glemaitre (Member Author)

So if I understand correctly, the use case is that you have some additional estimator checks that you would like to run in imbalanced-learn and you don't want to re-implement parametrize_with_checks (i.e. handling of xfail, skip via estimator tags, etc.)?

exactly

@NicolasHug (Member) commented Nov 3, 2020

I wouldn't say it adds that much complexity

I kinda disagree here because this PR makes a quite strong assumption: it assumes that the checks come from a generator of callables (whose signatures are "experimental", but still), while that's supposed to be an implementation detail. When it's in, we can't really go back on this - and yet we might want to refactor our check framework someday, considering how difficult it is to extend (see e.g. the 4 prototype PRs for just adding 1 new "api_only" parameter).

I won't oppose (EDIT: can't guarantee that anymore lol), but IMHO, and after having worked quite a bit on it recently, our checks framework isn't mature or clean enough for us to "open source" it.

if checks_generator is None:
    checks_generator = _yield_all_checks

def _checks_generator():

Member

Nit but we don't need the newly-added leading underscore here since there's no notion of private/public (it might simplify the diff also)

@NicolasHug (Member)

Basically it makes me uncomfortable because this means that with this PR, estimator_checks.py becomes a public framework for arbitrary checks (much like pytest, in a way), instead of being simply a specific suite of (semi-private) checks + 2 public utilities to run them. I wouldn't mind if we were sure that we had something solid regarding the design of our entire framework, but we really don't.

For example, check_estimator(generate_only=True, checks_generator=...) is basically a no-op and makes little sense. I feel like it indicates that the current design just isn't mature enough yet.

Also, what exactly needs to be re-written in downstream libraries? It seems to me that only check_estimator needs to be re-implemented (in order to substitute _yield_all_checks for something else), not the entire infrastructure? You'd need to rely on private utilities, but the new parameter is experimental anyway, so you don't have better guarantees there, it seems.

@glemaitre (Member Author)

Also, what exactly needs to be re-written in downstream libraries?

To take advantage of the tags, you need to reuse the way the xfail handling works. So as not to break anything downstream, I would need to copy-paste the entire infrastructure.

At least it would work. However, if I want to follow the progress of scikit-learn, then I want to modify as little as possible of the scikit-learn utilities. In this case, I am importing some private things which will eventually break (about which I cannot complain). The change could be a couple of lines within a function, or a private function that no longer exists.

That's why I would rather deal with some private API changes instead of having to investigate internal code changes since the last release. However, I agree that in both cases it is my problem and not scikit-learn's, since these are private changes.

@NicolasHug (Member)

To take advantage of the tags, you need to reuse the way the xfail handling works

Only for the xfail_checks tag, right? And for that it seems that all you need to import is _maybe_skip()? Or maybe I don't understand what you mean by "the xfail"

And worst case scenario, it's always possible to set sklearn.utils.estimator_checks._yield_all_checks at runtime? It's a hack but you would not need to copy paste anything
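
For reference, the runtime-patching hack mentioned here would look roughly like the sketch below. It is not a recommendation: _yield_all_checks is a private helper whose name and signature may change, and check_fit_returns_self stands in for a hypothetical downstream check.

import numpy as np

import sklearn.utils.estimator_checks as estimator_checks

_sklearn_yield_all_checks = estimator_checks._yield_all_checks


def check_fit_returns_self(name, estimator):
    # Hypothetical downstream check added on top of the scikit-learn ones.
    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y = np.array([0, 0, 1, 1])
    assert estimator.fit(X, y) is estimator


def patched_yield_all_checks(estimator):
    # Keep the upstream checks and append the downstream one.
    yield from _sklearn_yield_all_checks(estimator)
    yield check_fit_returns_self


# Assumes the private helper is looked up by name at run time; the patch must
# be applied before parametrize_with_checks or check_estimator is used.
estimator_checks._yield_all_checks = patched_yield_all_checks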

@glemaitre (Member Author)

Only for the xfail_checks tag, right? And for that it seems that all you need to import is _maybe_skip()? Or maybe I don't understand what you mean by "the xfail"

We were importing:

from sklearn.utils.estimator_checks import _mark_xfail_checks
from sklearn.utils.estimator_checks import _set_check_estimator_ids

but these two functions do not exist anymore and have been replaced. So we need to change the imports and modify our own parametrize_with_checks.

And worst case scenario, it's always possible to set sklearn.utils.estimator_checks._yield_all_checks at runtime? It's a hack but you would not need to copy paste anything

Yes we could potentially monkey patch:

@parametrize_with_checks(
    list(_tested_estimators()), checks_generator=sklearn_yielder,
)
def test_estimators_compatibility_sklearn(estimator, check, request):
    _set_checking_parameters(estimator)
    check(estimator)

@parametrize_with_checks(
    list(_tested_estimators()), checks_generator=imblearn_yielder,
)
def test_estimators_imblearn(estimator, check, request):
    # Common tests for estimator instances
    with ignore_warnings(category=(FutureWarning,
                                   ConvergenceWarning,
                                   UserWarning, FutureWarning)):
        _set_checking_parameters(estimator)
        check(estimator)

So it means that we should be careful about test ordering while overwriting _yield_all_checks.
And I find it really sloppy and difficult to debug if something goes wrong :)

@adrinjalali (Member) left a comment

I really like this, we can even have different pre-defined generators for users to try, like all and api and ...

Comment on lines +788 to +789
available in scikit-learn. It is common for a third-party library to extend
the test suite with its own estimator checks.

Member

I like your usage of "common" here :D :D

The generator yielding checks for the estimators. By default, the
common checks from scikit-learn will be yielded.

.. versionadded:: 0.24

Member

also note experimental


def test_check_estimator_checks_generator():
    # Check that we can pass a custom checks generator in `check_estimator`
    assert_warns_message(

Member

with pytest.warns(...):?
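
For context, the pytest idiom being suggested looks roughly like this; the warning class and message below are placeholders, not the ones from the actual test.

import warnings

import pytest


def run_custom_check():
    # Stand-in for code that is expected to emit a warning.
    warnings.warn("custom check was run", UserWarning)


def test_custom_check_warns():
    with pytest.warns(UserWarning, match="custom check was run"):
        run_custom_check()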

test_estimator = decorator(test_estimator)
for _mark in test_estimator.pytestmark:
    for estimator, check in _mark.args[1]:
        assert_warns_message(

Member

with pytest.warns?

@NicolasHug (Member)

we can even have different pre-defined generators for users to try, like all and api and ...

That would be redundant with the api_only parameter, and we would support 2 ways of doing the same thing. This isn't good practice in terms of API design, and this is another sign that there's something wrong with this new parameter.

@rtavenar (Contributor) commented Nov 7, 2020

Having more flexibility in check generation wouldn't hurt, I think. For instance, currently in tslearn the common checks are monkeypatched (#14057 (comment)), mostly to address the difference in input types, but I guess this could also be useful, maybe @rtavenar? There are common checks that would make sense for time-series input that we don't include in scikit-learn.

I agree. In our case, we are mostly interested in tuning the data that goes into the checks, but I guess there are other use cases where the checks themselves should be tuned. I don't know what the best technical solution for that is, but having a principled way to do it at some point would be great for downstream libraries.

@adrinjalali (Member)

That would be redundant with the api_only parameter, and we would support 2 ways of doing the same thing. This isn't good practice in terms of API design, and this is another sign that there's something wrong with this new parameter.

Another way to look at it is that api_only doesn't allow this use case, but this new parameter allows us to also implement api_only (almost, except for the tests which are half API and half not, which arguably should be split into multiple tests).

@NicolasHug (Member)

I agree with your comment @adrinjalali and I feel like it actually illustrates one of the numerous ways in which the check framework isn't mature enough ;), in particular this part:

the tests which are half API and half not, which arguably should be split into multiple tests

@NicolasHug (Member)

Maybe we can meet halfway: is there a way we can implement all this via a completely private logic? Like, instead of adding a new parameter, can we instead implement a private way to "register" the generator to be used by check_estimator? Basically something cleaner and safer than manually setting sklearn.utils.estimator_checks._yield_all_checks.

This way, libraries can plug in their own generators, but we keep this private so that we don't have to constrain ourselves in the future when we refactor our check logic (which IMHO will have to happen soonish if we want to keep expanding it efficiently; an entire reworking would likely be beneficial for everyone in the long term).
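
To make the "register" idea concrete, here is a hypothetical sketch of what such a private hook inside estimator_checks.py could look like; none of these names exist in scikit-learn, they only illustrate the proposal.

from sklearn.utils.estimator_checks import _yield_all_checks  # private helper

_checks_generator = None


def _register_checks_generator(generator):
    # A downstream library would call this once, at its own risk, to replace
    # the default generator used by check_estimator / parametrize_with_checks.
    global _checks_generator
    _checks_generator = generator


def _get_checks_generator():
    # check_estimator / parametrize_with_checks would call this instead of
    # referring to _yield_all_checks directly.
    return _checks_generator if _checks_generator is not None else _yield_all_checks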

@rth (Member) commented Nov 8, 2020

Maybe we can meet halfway: is there a way we can implement all this via a completely private logic?

What would be the practical difference with an experimental checks_generator parameter as could be done in this PR? Either way there would be some users relying on it (hopefully not that many), and we would try to avoid breaking this mechanism unless we see no other choice.

illustrates one of the numerous ways in which the check framework isn't mature enough ;)

Do you mean not mature in terms of implementation/maintenance or usage?
How would you propose making them more mature? Including feedback from contrib projects, who are the only "users" of this interface (aside from scikit-learn), seems like a step in the right direction.

If the implementation complexity is a concern, one somewhat radical step could be to require pytest for the common tests and deprecate check_estimator in favor of parametrize_with_checks. Personally, I have never run into a use case in contrib projects where using parametrize_with_checks wouldn't be preferable. That way we would spend less effort re-implementing and maintaining a testing framework.

@NicolasHug (Member)

What would be the practical difference with an experimental checks_generator parameter as could be done in this PR?

In the current state of this PR, the parameter isn't experimental; only the signature of the check is noted as experimental. But we're stuck with the parameter once it's introduced. I'd be more comfortable if the whole parameter was experimental, but even more so if we kept everything private: those few that really need this will use it at their own risk, but we don't advertise it.

Do you mean not mature in terms of implementation/maintenance or usage?

Not mature in terms of implementation/maintenance. The usage/public API is fine as far as I can tell.

How would you propose making them more mature?

I've been submitting quite a few PRs to simplify the logic lately, while working on the api_only mode (which you've reviewed 🙏). But at this point I feel like the whole file could do with a big refactoring / an entire re-writing. It was suggested in #18582 that checks should be re-written to be either all-API or non-API, but not both. This will be a lot of work, so while we're at it, we might as well re-write the entire thing so that it fits our current needs, as well as those of third-party libraries, in a clean way.

We pass useless parameters to most checks: api_only is not always used by the checks, and the name is in general not used at all. In addition, the interaction with the xfail_checks tag is such that it's impossible to bypass a check based on its input. These are things that we could also solve if we were to re-write our check framework.

one somewhat radical step could be to require pytest for the common tests

I'd be in favor of that; not relying on pytest has been an impediment to maintaining the check suite so far.
