ENH Support for XFAIL/XPASS in common tests #16306


Closed
rth wants to merge 13 commits from the common-test-known-failure branch

Conversation

@rth (Member) commented Jan 30, 2020

This adds support for marking common tests as a known failure in pytest.

Motivation

There are common checks that should pass for all estimators but in reality fail. These are typically either skipped by raising SkipTest, or kept out of master until all estimators pass (e.g. #15015).

With this approach we can instead mark such tests as known failures; they will not show up as errors but will be listed in the final test report, e.g.,

XFAIL sklearn/tests/test_common.py::test_estimators[BernoulliRBM()-check_methods_subset_invariance]
  reason: score_samples of BernoulliRBM is not invariant when applied to a subset.

In addition, we can mark such a test as a known failure without raising an exception: the test continues to execute and is reported as XFAIL (if it fails) or XPASS (if it passes) at the end. This is implemented by (optionally) passing the pytest request fixture to the check. The use case for XPASS is when the failure was fixed in a PR but the corresponding common check was not updated.
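
The sketch below illustrates the idea with a hypothetical helper name (_mark_xfail); the actual code in this PR differs in its details, e.g. it also checks the sys._is_pytest_session flag set in conftest.py.

from unittest import SkipTest

import pytest


def _mark_xfail(reason, request=None):
    # hypothetical helper: mark the current test as a known failure
    if request is None:
        # no request fixture available: fall back to skipping, as before
        raise SkipTest('XFAIL ' + reason)
    # attach an xfail marker to the running test; the check keeps executing
    # and is reported as XFAIL if it fails or XPASS if it passes
    request.applymarker(pytest.mark.xfail(reason=reason))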

If pytest is not installed or we are not inside a scikit-learn pytest session, this does not change the behavior in any way and a SkipTest is raised, as before. I don't think this feature is useful for contrib projects that use check_estimator, since the checks marked as known failures concern scikit-learn estimators exclusively.

The goal is that, for checks requiring compliance from multiple estimators, we first add the check marked as XFAIL, then let contributors open PRs to fix individual estimators, while master shows up-to-date information at any point about what has been fixed and what hasn't. I don't think this needs additional documentation at this point; rather, individual issues should explain what needs to be done in each case.

In particular I would like to apply this for #15015 (comment), #16290 and possibly #16286 so it would be nice if it was merged during the Paris sprint.

Possibly cc @glemaitre @adrinjalali @jeremiedbb @lesteve

@adrinjalali (Member)

I don't know pytest well enough to comment on the correctness of it. My question is, how can we then run the tests and get the list of failures if we want to?

@rth (Member, Author) commented Jan 30, 2020

My question is, how can we then run the tests and get the list of failures if we want to?

Yes, forgot about that. Running pytest with the --runxfail option will run tests marked as a known failure (but also all the normal tests), and errors will be reported with tracebacks as usual.

@@ -13,7 +13,7 @@ addopts =
--ignore maint_tools
--doctest-modules
--disable-pytest-warnings
-  -rs
+  -rxXs
@rth (Member, Author):
This tells pytest to show XFAIL and XPASS tests in the final summary report (which previously only included SKIP).

@rth (Member, Author) commented Jan 30, 2020

BTW, the current XPASS output for common tests,

XPASS sklearn/tests/test_common.py::test_estimators[MiniBatchSparsePCA()-check_methods_subset_invariance] transform of MiniBatchSparsePCA is not invariant when applied to a subset.
XPASS sklearn/tests/test_common.py::test_estimators[NuSVC()-check_methods_subset_invariance] decision_function of NuSVC is not invariant when applied to a subset.
XPASS sklearn/tests/test_common.py::test_estimators[SparsePCA()-check_methods_subset_invariance] transform of SparsePCA is not invariant when applied to a subset.

indicates that while these checks were skipped in the past, they currently pass without an exception (at least on my laptop).

@adrinjalali (Member)

Yes, forgot about that. Running pytest with the --runxfail option will run tests marked as a known failure (but also all the normal tests), and errors will be reported with tracebacks as usual.

Could we have this in our guides somewhere please? :d.

@cmarmo fixing these failing checks, or unmarking the ones that currently pass, could make a bunch of good first issues, if you have a chance to create issues for them every now and then :)

@rth (Member, Author) commented Jan 30, 2020

Could we have this in our guides somewhere please? :d.

It's in the pytest documentation (https://docs.pytest.org/en/latest/skipping.html#ignoring-xfail). I don't think we should copy the pytest documentation into scikit-learn :)

@@ -87,13 +88,20 @@ def _tested_estimators():


@parametrize_with_checks(_tested_estimators())
-def test_estimators(estimator, check):
+def test_estimators(estimator, check, request):
@rth (Member, Author):
request is a built-in pytest fixture that provides information about the requesting test function.
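
For illustration, a minimal standalone example (not code from this PR): any test can declare request as an argument and pytest injects the fixture automatically.

import pytest


def test_something(request):
    # the fixture describes the running test item
    assert request.node.name == "test_something"
    # markers can also be attached to the test at runtime, which is what
    # this PR relies on
    request.applymarker(pytest.mark.xfail(reason="example"))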

@adrinjalali (Member)

tips.rst does have some pytest tips though, which is where I learned some useful ones when I started. pytest's docs were just too much to go through to find the relevant ones. But no strong feelings.

request: default=None
result of the pytest request fixture.
"""
if not getattr(sys, "_is_pytest_session", False):
@rth (Member, Author) commented Jan 30, 2020:
This is set at the beginning of the test session in our conftest.py.
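
For context, a minimal sketch of what such a hook in conftest.py could look like (pytest_configure/pytest_unconfigure are standard pytest hooks; apart from the sys._is_pytest_session flag name taken from the diff above, this is an assumption rather than the exact code in this PR):

import sys


def pytest_configure(config):
    # flag that we are running inside a scikit-learn pytest session
    sys._is_pytest_session = True


def pytest_unconfigure(config):
    # clean up the flag when the session ends
    sys._is_pytest_session = False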

@adrinjalali (Member)

I'm happy to have this so that those tests can move forward and we can fix them gradually.

@jnothman (Member) left a comment:
Not really reviewed yet

else:
# mark test as XFAIL and continue execution to see if it will
# actually fail.
request.applymarker(pytest.mark.xfail(run=False, reason=reason))
Reviewer (Member):
I would think we need to set run=True to continue execution.

Suggested change:
- request.applymarker(pytest.mark.xfail(run=False, reason=reason))
+ request.applymarker(pytest.mark.xfail(run=True, reason=reason))

But it seems like the function will continue to run regardless of the parameter. (This marker will not stop the function from running).

Either way, this is the desired behavior.

@rth (Member, Author) commented Jan 31, 2020:
But it seems like the function will continue to run regardless of the parameter. (This marker will not stop the function from running).

Hah, yes, I found this in some GitHub discussion a while ago. I'm not too sure about the different parameters, but it does work as expected :)

Edit: reverted to the default run=True, which is consistent with your comment.

@jeremiedbb (Member)

According to the coverage, the branch for NaiveBayes is never exercised.

raise SkipTest('XFAIL ' + str(reason))
try:
import pytest
if request is None:
Reviewer (Member):
can request be None?

@rth (Member, Author):
Not in the way it's used now. You are right -- simplified this function.

# mark test as XFAIL and continue execution to see if it will
# actually fail.
request.applymarker(pytest.mark.xfail(reason=reason))
except ImportError:
Reviewer (Member):
time to make pytest a dependency for the whole test suite? :D
(I'm kidding, I don't want to start a discussion here)

@rth (Member, Author) commented Jan 31, 2020

According to the coverage, the branch for NaiveBayes is never exercised.

Yes, because they don't have class_weight so the check is never run. It's not really a known failure. Removed it.

@jeremiedbb (Member) left a comment:
I'm fine with this implementation

@thomasjpfan (Member)

Having a check in the public API be able to xfail itself based on the estimator name still feels a little strange.

Can this be more generic, i.e. something like this:

from functools import partial
from inspect import signature
from unittest import SkipTest

import pytest


def _skiptest(reason):
    raise SkipTest('XFAIL ' + reason)

def check_class_weight_classifiers(name, estimator_orig, xfailed=_skiptest):
    if name == "NuSVC":
        xfailed("Not testing NuSVC class weight as it is ignored.")
    ...


# in test_common.py
def _xfailed_func(request, reason):
    request.applymarker(pytest.mark.xfail(reason=reason))

def test_estimators(estimator, check, request):
    ...
    args = {}
    if "xfailed" in signature(check).parameters:
        # bind the request fixture so the check can mark itself as XFAIL
        args['xfailed'] = partial(_xfailed_func, request)
    check(estimator, **args)

@glemaitre (Member)

I think I prefer this solution. It is close to what is already written. In all cases, we should either move toward solving these failures or just tag them as known with estimator tags.
The alternative solution with a global dictionary would duplicate the effort of the estimator tags.

@jnothman (Member) left a comment:
I agree with @thomasjpfan that it remains awkward that the duty of marking an estimator as exceptional belongs to the check. It makes more sense for that marking to happen in test_common, and now that we have turned check_estimator into a check generator, it seems as if that should be possible. One difficulty here is that the check is only skipped for selected methods. Does that not then sound like a limitation that could be expressed through tags? Or are tags inappropriate because they imply that this is correct behaviour, while here you are trying to express that it is incorrect behaviour?

@glemaitre (Member)

it remains awkward that the duty of marking an estimator as exceptional belongs to the check

I am fine with the changes proposed here: #16306 (comment)

I just feel the approach in this PR (with the proposed changes as well) is better suited than the alternative #16328, which introduces a new global dictionary (something we tried to get rid of when we had global lists for skipping some tests).

One difficulty here is that the check is only skipped for selected methods. Does that not then sound like a limitation that could be expressed through tags? Or are tags inappropriate because they imply that this is correct behaviour, while here you are trying to express that it is incorrect behaviour?

Maybe the check should be more granular and the _skip_test tag could be a list of the tests to be skipped?

@rth (Member, Author) commented Feb 20, 2020

I would be OK with making this work with estimator tags, by generalizing _skip_test to take a list of regexps and introducing an equivalent _xfail_test estimator tag.

That would address @thomasjpfan's concerns without having to maintain a dict of skipped checks in test_estimators (or globally), and it would generalize nicely to contrib projects.
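
To illustrate, a rough sketch of what the proposed _xfail_test tag could look like on an estimator (the tag name and value format are assumptions based on this discussion, not the API that was eventually merged):

from sklearn.base import BaseEstimator


class MyEstimator(BaseEstimator):
    def _more_tags(self):
        # hypothetical tag mirroring _skip_test: a list of common-check name
        # patterns that should be marked XFAIL instead of being skipped
        return {"_xfail_test": [r"check_methods_subset_invariance"]}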

@rth (Member, Author) commented Feb 20, 2020

Thanks for the suggestion in #16306 (comment) @thomasjpfan, I agree it's an improvement; I haven't had time to address it so far. Now, however, I'm leaning more toward an estimator tag solution.

@glemaitre glemaitre closed this Feb 20, 2020
@rth rth deleted the common-test-known-failure branch February 20, 2020 21:33