FIX show only accuracy when having a subset of labels in classification report #28399
Conversation
We also need a non-regression test to check that we have the expected behaviour, and an entry in the changelog to acknowledge the bug fix.
@glemaitre Updated as suggested.
glemaitre
left a comment
Here are some comments.
doc/whats_new/v1.5.rst
Outdated
:class:`~metrics.PrecisionRecallDisplay`, :class:`~metrics.DetCurveDisplay`,
:class:`~calibration.CalibrationDisplay`.
:pr:`28051` by :user:`Pierre de Fréminville <pidefrem>`.
- |Fix| :class:`metrics.classification_report` now shows only accuracy and not micro-average when input is a subset of labels.
This needs to be moved to 1.4, under the 1.4.2 changelog. We will include it in the next bug fix release.
You also need to make sure that your entry is less than 88 characters per line in the changelog.
def test_classification_report_input_subset_of_labels():
    y_true, y_pred = [0, 1], [0, 1]
I think that we want a slightly different test instead: we want to check whether different `labels` inputs show the "accuracy" entry or not. So we could parametrize the test by trying labels=([0, 1, 2], [0, 1], [0]).
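A minimal sketch of what such a parametrized test could look like (the test name and the expected flags below are illustrative assumptions, not the final merged test):

import pytest

from sklearn.metrics import classification_report


@pytest.mark.parametrize(
    "labels, expect_micro_avg",
    [
        ([0, 1, 2], False),  # superset of observed classes -> "accuracy"
        ([0, 1], False),  # exactly the observed classes -> "accuracy"
        ([0], True),  # strict subset -> "micro avg"
    ],
)
def test_classification_report_labels_subset(labels, expect_micro_avg):
    # y_true/y_pred only ever contain the classes {0, 1}.
    y_true, y_pred = [0, 1], [0, 1]
    report = classification_report(
        y_true, y_pred, labels=labels, output_dict=True, zero_division=0.0
    )
    if expect_micro_avg:
        assert "micro avg" in report and "accuracy" not in report
    else:
        assert "accuracy" in report and "micro avg" not in report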
glemaitre
left a comment
I simplified the test to just check for the presence of the key, since that is the reported issue, and I moved the entry to the right changelog as requested.
I moved the what's new entry to target 1.4.2.
LGTM. Thanks @vjoshi253
Reference Issues/PRs
This PR fixes issue #27927.
What does this implement/fix? Explain your changes.
There is an inconsistency between how the micro-average is computed in the code and what the documentation states.
The documentation says that the micro average is only shown for multi-label data, or multi-class data with a subset of classes, because it corresponds to accuracy otherwise.
But the code shows the micro-average even for superset cases.
The code fix is quite trivial and makes the code and documentation consistent.
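To make the intended condition concrete, here is a small illustrative helper (a hypothetical sketch, not the actual patched code) capturing when the micro average collapses to accuracy:

from sklearn.utils.multiclass import unique_labels


def micro_average_is_accuracy(y_true, y_pred, labels=None):
    """Hypothetical helper mirroring the condition behind the fix.

    The micro average collapses to plain accuracy whenever the requested
    labels cover every class observed in y_true/y_pred (no labels given,
    an equal set, or a superset). Only a strict subset of the observed
    classes warrants a genuine "micro avg" row in the report.
    """
    observed = set(unique_labels(y_true, y_pred))
    return labels is None or set(labels) >= observed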
Any other comments?
The author of the issue shared some reproduction steps and the expected output. This fix produces the expected output.
print(classification_report([0, 1], [1, 0], labels=[0, 1, 2], zero_division=0.0))

Output:
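The report below is reconstructed from the inputs (every prediction is wrong, so all scores are 0.0; the exact spacing may differ); the key point is that an accuracy row appears instead of micro avg:

              precision    recall  f1-score   support

           0       0.00      0.00      0.00         1
           1       0.00      0.00      0.00         1
           2       0.00      0.00      0.00         0

    accuracy                           0.00         2
   macro avg       0.00      0.00      0.00         2
weighted avg       0.00      0.00      0.00         2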