[MRG] Fix sklearn.metrics.classification._check_targets where y_true and y_pred are both binary but the union is multiclass #8377
    if y_type in ["binary", "multiclass"]:
        y_true = column_or_1d(y_true)
        y_pred = column_or_1d(y_pred)
        if y_type == "binary":
This is the least invasive fix I could find. It is not very efficient, because np.unique has already been called on y_true and y_pred in type_of_target to determine their type.
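The check described here can be sketched roughly as follows. This is a minimal illustration, not the code merged in the PR: the helper name `resolve_target_type` is hypothetical, and it assumes the per-array type has already been determined to be "binary" before the union is inspected.

```python
import numpy as np


def resolve_target_type(y_true, y_pred, y_type):
    """Hypothetical helper: return 'multiclass' when two binary arrays
    together contain more than two distinct labels."""
    if y_type == "binary":
        # np.union1d returns the sorted unique values found in either array.
        # This repeats work already done when the type was inferred, which
        # is the inefficiency mentioned in the comment above.
        unique_values = np.union1d(y_true, y_pred)
        if len(unique_values) > 2:
            y_type = "multiclass"
    return y_type


# Each array has two labels, but together they have three.
print(resolve_target_type([0, 1, 1], [1, 2, 2], "binary"))  # multiclass
# Both arrays share the same two labels.
print(resolve_target_type([0, 1, 1], [1, 0, 0], "binary"))  # binary
```

The extra pass over both arrays is the price of keeping the change local to the binary branch.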
    y_true_inv = ["b" if i == "a" else "a" for i in y_true]

    assert_almost_equal(matthews_corrcoef(y_true, y_true_inv), -1)
    y_true_inv2 = label_binarize(y_true, ["a", "b"]) * -1
y_true_inv2 is an array of integers (0 and -1 after the multiplication) whereas y_true is an array of 'a' and 'b'. The test was passing just by chance as noted in #8094 (comment).
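To see why this pairing is a problem, it helps to count the labels in the union. The snippet below is only an illustration with made-up data of the same shape as the test (string labels against integer labels); the function name `combined_labels` is hypothetical and plain Python sets stand in for the library's own label handling.

```python
def combined_labels(y_true, y_pred):
    """Hypothetical helper: distinct labels appearing in either sequence."""
    return set(y_true) | set(y_pred)


y_true = ["a", "b", "a", "b"]
y_numeric = [0, -1, 0, -1]  # same shape as a binarized copy multiplied by -1

# Each sequence has two distinct labels, but they share none of them.
print(len(set(y_true)), len(set(y_numeric)))    # 2 2
print(len(combined_labels(y_true, y_numeric)))  # 4
```

Each input looks binary on its own, yet the union has four labels, which is exactly the case this PR reports as 'multiclass'.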
    # And also for any other vector with 0 variance
    mcc = assert_warns_message(RuntimeWarning, 'invalid value encountered',
                               matthews_corrcoef, y_true,
                               rng.randint(-100, 100) * np.ones(20, dtype=int))
Similar to the previous change: y_true is an array of 'a' and 'b' and we are comparing it to an array of ints, so something somewhere should tell us that something is wrong.
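For intuition about the two cases this test touches (fully inverted labels and a zero-variance vector), here is a small sketch of the textbook binary Matthews correlation coefficient computed from confusion counts. It is not scikit-learn's implementation: the function `binary_mcc` and its `positive` parameter are hypothetical, and it returns NaN for a zero denominator instead of emitting a RuntimeWarning.

```python
import math


def binary_mcc(y_true, y_pred, positive):
    """Textbook binary MCC from confusion counts (illustrative only)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # A constant y_true or y_pred makes one factor zero: MCC is undefined.
        return float("nan")
    return (tp * tn - fp * fn) / denominator


y_true = ["a", "b", "a", "b"]
y_true_inv = ["b" if label == "a" else "a" for label in y_true]

print(binary_mcc(y_true, y_true_inv, positive="a"))  # -1.0
print(binary_mcc(y_true, ["a"] * 4, positive="a"))   # nan
```

Both calls use predictions drawn from the same label set as y_true, which is the condition the corrected test relies on.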
Codecov Report
@@            Coverage Diff            @@
##           master    #8377    +/-   ##
==========================================
+ Coverage   94.75%   94.75%   +<.01%
==========================================
  Files         342      342
  Lines       60902    60911      +9
==========================================
+ Hits        57708    57717      +9
  Misses       3194     3194
jnothman left a comment
LGTM. A bug fix changelog entry is deserved.
Thanks for the review, I added an entry in the changelog.
Thank you.
Thanks for the review @MechCoder!
…and y_pred are both binary but the union is multiclass (scikit-learn#8377) * Fix _check_targets where y_true and y_pred are both binary but the union of them is multiclass. * Add entry in changelog
Reference Issue
Fix part of #8098.
What does this implement/fix? Explain your changes.
In the case where both y_pred and y_true are binary, it checks that the union of them has only two values; otherwise it returns 'multiclass' as the type.
Any other comments?
I saw #8094 and thought that it was doing too many different things at once. This is an attempt at making it do one less thing.