ENH Support for XFAIL/XPASS in common tests #16306
Conversation
I don't know pytest well enough to comment on the correctness of it. My question is: how can we then run the tests and get the failing list if we want to?
Yes, forgot about that. Running pytest with the …
```diff
@@ -13,7 +13,7 @@ addopts =
     --ignore maint_tools
     --doctest-modules
     --disable-pytest-warnings
-    -rs
+    -rxXs
```
This says to show XFAIL and XPASS tests in the final summary report (that previously only included SKIP)
BTW, the current XPASS output for common tests indicates that while these checks were skipped in the past, they currently pass without an exception (at least on my laptop).
Could we have this in our guides somewhere please? :D @cmarmo, fixing these failing checks, or unmarking the currently passing ones, may be a bunch of good first issues, if you have a chance to create issues for them every now and then :)
It's in the pytest documentation (https://docs.pytest.org/en/latest/skipping.html#ignoring-xfail). I don't think we should copy the pytest documentation inside scikit-learn :)
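(For what it's worth, the page linked above also documents the `--runxfail` option; if I read it correctly, running e.g. `pytest --runxfail sklearn/tests/test_common.py` forces the xfail-marked checks to run and be reported as ordinary failures, which would give the failing list asked about earlier.)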
```diff
@@ -87,13 +88,20 @@ def _tested_estimators():
 
 
 @parametrize_with_checks(_tested_estimators())
-def test_estimators(estimator, check):
+def test_estimators(estimator, check, request):
```
`request` is a built-in pytest fixture providing information about the requesting test function.
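For context, a tiny standalone illustration of the fixture (not scikit-learn code): any test that declares a `request` parameter receives the current test item and can attach markers to it while it runs.

```python
import pytest

def test_example(request):
    # `request.node` is the item for the currently running test
    print(request.node.name)  # e.g. "test_example"
    # markers can be attached dynamically while the test is running
    request.applymarker(pytest.mark.xfail(reason="illustrative known issue"))
```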
sklearn/utils/estimator_checks.py
Outdated
```python
    request : default=None
        result of the pytest request fixture.
    """
    if not getattr(sys, "_is_pytest_session", False):
```
This is set at the beginning of the test session in our conftest.py.
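A minimal sketch of how a conftest.py could set that flag, assuming it uses the `pytest_configure`/`pytest_unconfigure` hooks (the actual conftest.py change isn't quoted in this thread):

```python
# conftest.py (sketch only; the real file may differ)
import sys

def pytest_configure(config):
    # Flag that we are running inside a scikit-learn pytest session, so the
    # common checks can choose xfail markers over raising SkipTest.
    sys._is_pytest_session = True

def pytest_unconfigure(config):
    # Remove the flag when the session ends.
    del sys._is_pytest_session
```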
I'm happy to have this for those tests to move forward and for us to fix them gradually.
Not really reviewed yet
Co-Authored-By: Joel Nothman <[email protected]>
sklearn/utils/estimator_checks.py
Outdated
```python
    else:
        # mark test as XFAIL and continue execution to see if it will
        # actually fail.
        request.applymarker(pytest.mark.xfail(run=False, reason=reason))
```
I would think we need to set `run=True` to continue execution.

```diff
-        request.applymarker(pytest.mark.xfail(run=False, reason=reason))
+        request.applymarker(pytest.mark.xfail(run=True, reason=reason))
```
But it seems like the function will continue to run regardless of the parameter. (This marker will not stop the function from running).
Either way, this is the desired behavior.
> But it seems like the function will continue to run regardless of the parameter. (This marker will not stop the function from running).

Hah, yes, I found this in some GitHub discussion a while ago. Not too sure about the different parameters, but it does work as expected :)
Edit: reverted to the default `run=True`, which is consistent with your comment.
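For what it's worth, a small illustration of the difference as I understand it (toy tests, not from this PR): as a decorator, `run=False` keeps the body from executing at all, whereas a marker applied from inside an already-running test cannot stop execution, so the outcome decides between XFAIL and XPASS.

```python
import pytest

@pytest.mark.xfail(run=False, reason="known failure")
def test_decorated():
    # with run=False on the decorator, pytest reports XFAIL without
    # executing this body
    assert False

def test_dynamic(request):
    # the marker is applied while the test is already running, so execution
    # continues regardless of run=; a failure is reported as XFAIL, a pass
    # as XPASS
    request.applymarker(pytest.mark.xfail(run=False, reason="known failure"))
    assert True
```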
According to the coverage report, the branch for NaiveBayes is left over (never exercised).
sklearn/utils/estimator_checks.py
Outdated
```python
        raise SkipTest('XFAIL ' + str(reason))
    try:
        import pytest
        if request is None:
```
Can `request` be `None`?
Not in the way it's used now. You are right -- simplified this function.
```python
        # mark test as XFAIL and continue execution to see if it will
        # actually fail.
        request.applymarker(pytest.mark.xfail(reason=reason))
    except ImportError:
```
Time to make pytest a dependency for the whole test suite? :D
(I'm kidding I don't want to start a discussion here)
Yes, because they don't have `class_weight`, so the check is never run. It's not really a known failure. Removed it.
I'm fine with this implementation
Having a check in the public API be able to xfail a pytest test still feels a bit awkward. Can this be more generic, i.e. something like this:

```python
def _skiptest(reason):
    raise SkipTest('XFAIL ' + reason)

def check_class_weight_classifiers(name, estimator_orig, xfailed=_skiptest):
    if name == "NuSVC":
        xfailed("Not testing NuSVC class weight as it is ignored.")
    ...

# in test_common.py
from functools import partial
from inspect import signature

def _xfailed_func(request, reason):
    request.applymarker(pytest.mark.xfail(reason=reason))

def test_estimators(estimator, check, request):
    ...
    args = {}
    if "xfailed" in signature(check).parameters:
        args['xfailed'] = partial(_xfailed_func, request)
    check(estimator, **args)
```
I think that I prefer this solution. It is close to what is already written. I think that in all cases we should move either toward solving these failures or just tagging them as known with estimator tags.
I agree with @thomasjpfan that it remains awkward that the duty of marking an estimator as exceptional belongs to the check. It makes more sense for that marking to happen in test_common, and now that we have turned check_estimator into a check generator, it seems as if that should be possible. One difficulty here is that the check is only skipped for selected methods. Does that not then sound like a limitation that could be expressed through tags? Or are tags inappropriate because they imply that this is correct behaviour, while here you are trying to express that it is incorrect behaviour?
I am fine with the changes proposed here: #16306 (comment). I just feel the approach in this PR (with the changes proposed as well) is better suited than the alternative #16328, which introduces a new global dictionary (something we tried to get rid of when we had a global list to avoid some tests).
Maybe the check should be more granular and the …
I would be OK making this work with estimator tags by generalizing this. That would address @thomasjpfan's concerns without having to maintain a dict of skipped checks.
Thanks for the suggestion in #16306 (comment) @thomasjpfan, I agree it's an improvement; I haven't had time to address it so far. Now, however, I'm leaning more toward an estimator tag solution.
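To make the tag idea concrete, here is a hypothetical sketch of what a tag-based variant could look like in test_common; the tag name `_xfail_checks`, the helper name, and the use of the private `_get_tags()` API are my own illustration, not part of this PR.

```python
import pytest

def _maybe_mark_xfail(estimator, check, request):
    # Hypothetical: estimators declare known-failing checks in a tag that
    # maps check name -> reason.
    xfail_checks = estimator._get_tags().get("_xfail_checks") or {}
    check_name = getattr(check, "__name__", str(check))  # checks may be partials
    if check_name in xfail_checks:
        request.applymarker(
            pytest.mark.xfail(reason=xfail_checks[check_name]))

# test_estimators would stay parametrized as in this PR and just call the
# helper before running the check:
def test_estimators(estimator, check, request):
    _maybe_mark_xfail(estimator, check, request)
    check(estimator)
```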
This adds support for marking common tests as a known failure in pytest.
Motivation
There are common checks that should pass for estimators but in reality fail. These are typically either skipped by raising `SkipTest`, or not committed to master until all estimators pass (e.g. #15015). With this approach we can instead mark such tests as a known failure, which will not show as an error but will be shown in the final test report (e.g. as XFAIL lines in the summary).
In addition, we can mark such tests as a known failure without raising the exception: the test will continue to be executed and will be marked as XFAIL (if it fails) or XPASS (if it passes) at the end. This is implemented by (optionally) passing the pytest `request` fixture to the check. The use case of XPASS is when the failure was fixed in a PR but the corresponding common check was not modified. If pytest is not installed or we are not inside a scikit-learn pytest session, this does not change the behavior in any way and a skip test is raised, as before. I don't think this feature is useful for contrib projects that use `check_estimator`, since the tests marked as known failures are scikit-learn estimators exclusively.
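A minimal sketch of that fallback pattern as I read the diffs above (the helper name is illustrative; the real code lives in sklearn/utils/estimator_checks.py and uses its own SkipTest import):

```python
import sys
from unittest import SkipTest

def _xfail_or_skip(reason, request=None):
    # Outside a scikit-learn pytest session, or when no request fixture is
    # available, keep the old behaviour and skip the check.
    if request is None or not getattr(sys, "_is_pytest_session", False):
        raise SkipTest('XFAIL ' + str(reason))
    import pytest
    # Inside a pytest session: mark the test as a known failure and keep
    # executing it, so the outcome is reported as XFAIL or XPASS.
    request.applymarker(pytest.mark.xfail(reason=reason))
```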
The goal is that, for checks that require compliance from multiple estimators, we first add the check marked as XFAIL. Then contributors can open PRs to fix individual estimators, and at any point in the process master shows up-to-date information on what was fixed and what wasn't. I don't think this needs additional documentation at this point; rather, individual issues should explain what needs to be done in each case.
In particular I would like to apply this to #15015 (comment), #16290, and possibly #16286, so it would be nice if it were merged during the Paris sprint.
Possibly cc @glemaitre @adrinjalali @jeremiedbb @lesteve