ENH add zero_division in balanced_accuracy_score #28038


Open

wants to merge 30 commits into main

Conversation

TengaCortal

Reference Issues/PRs

Fixes #26892

What does this implement/fix? Explain your changes.

This addresses an inconsistency in the balanced_accuracy_score function, where the calculated balanced accuracy was not equal to the macro-average recall score. The issue was traced to missing zero-division handling, which resulted in unexpected discrepancies. The implementation was modified so that zero division is handled explicitly and the balanced accuracy is consistent with the macro-average recall score. The test suite was updated to reflect the corrected behavior and to check the metric in various scenarios.
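For illustration, a minimal sketch of the reported discrepancy (the pre-fix behavior described in #26892; the data is made up):

    import warnings
    from sklearn.metrics import balanced_accuracy_score, recall_score

    y_true = [0, 0, 1, 1]
    y_pred = [0, 0, 1, 2]  # class 2 never appears in y_true

    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # silences "y_pred contains classes not in y_true"
        # pre-fix: the ill-defined recall of class 2 is silently dropped
        print(balanced_accuracy_score(y_true, y_pred))  # 0.75 = mean(1.0, 0.5)

    # macro-average recall counts the ill-defined recall as 0
    print(recall_score(y_true, y_pred, average="macro", zero_division=0))  # 0.5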

Any other comments?


github-actions bot commented Dec 31, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 54dd010.

@glemaitre self-requested a review January 11, 2024 21:27
@glemaitre changed the title from "Fix: Ensure Consistency Between Balanced Accuracy and Macro-Average Recall" to "ENH add zero_division in balanced_accuracy_score" Jan 16, 2024
@glemaitre (Member) left a comment

Thanks @TengaCortal.

Here is a review. I think that the check of `C.sum(axis=1)` is unnecessary; we already catch this case by checking for NaN.

The unit test should also not reimplement the same computation as the function (we could be wrong twice). It is better to check for the regression, and we also have the possibility to check that we are consistent with the averaged recall score.
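A minimal sketch of such a consistency test, assuming the zero_division parameter this PR adds (the test name and data are hypothetical):

    import numpy as np
    import pytest
    from sklearn.metrics import balanced_accuracy_score, recall_score

    def test_balanced_accuracy_consistent_with_macro_recall():
        y_true = [0, 0, 1, 1]
        y_pred = [0, 0, 1, 2]  # class 2 never appears in y_true -> zero division
        for zero_division in [0.0, 1.0, np.nan]:
            balanced = balanced_accuracy_score(
                y_true, y_pred, zero_division=zero_division
            )
            macro_recall = recall_score(
                y_true, y_pred, average="macro", zero_division=zero_division
            )
            assert balanced == pytest.approx(macro_recall, nan_ok=True)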

@glemaitre self-requested a review May 18, 2024 13:03
@glemaitre (Member) left a comment

It looks good. Only a couple of nitpicks for consistency. We are only missing the np.nan case, to be consistent with the other metrics.

    zero_division : {"warn", 0, 1}, default="warn"
        Sets the value to return when there is a zero division. If set to "warn",
        a warning will be raised and 0 will be returned. If set to 0, the metric
        will be 0, and if set to 1, the metric will be 1.

Member

Suggested change:

      will be 0, and if set to 1, the metric will be 1.
    +
    + .. versionadded:: 1.6

@@ -2419,6 +2426,11 @@ def balanced_accuracy_score(y_true, y_pred, *, sample_weight=None, adjusted=False):

        performance would score 0, while keeping perfect performance at a score
        of 1.

        zero_division : {"warn", 0, 1}, default="warn"

Member

Suggested change:

    - zero_division : {"warn", 0, 1}, default="warn"
    + zero_division : {"warn", 0.0, 1.0, np.nan}, default="warn"

Member

To be consistent with other metrics, we need to offer the np.nan option as well.

Comment on lines 2430 to 2433
        Sets the value to return when there is a zero division. If set to "warn",
        a warning will be raised and 0 will be returned. If set to 0, the metric
        will be 0, and if set to 1, the metric will be 1.

Member

Suggested change:

    - Sets the value to return when there is a zero division. If set to "warn",
    -     a warning will be raised and 0 will be returned. If set to 0, the metric
    -     will be 0, and if set to 1, the metric will be 1.
    + Sets the value to return when there is a zero division.
    +
    + Notes:
    + - If set to "warn", this acts like 0, but a warning is also raised.
    + - If set to `np.nan`, such values will be excluded from the average.

@glemaitre added this to the 1.6 milestone May 19, 2024
@glemaitre self-requested a review October 29, 2024 17:08
@glemaitre (Member)

@lucyleeow This one is also of interest if you can have a look for a second review.

@glemaitre (Member) left a comment

I pushed my changes directly to try to make it into 1.6.

@glemaitre (Member)

Also, @adrinjalali, you might have a look as well so this makes it into 1.6.

@adrinjalali (Member) left a comment

@glemaitre WDYT?

Comment on lines 2530 to 2538
    zero_division : {"warn", 0.0, 1.0, np.nan}, default="warn"
        Sets the value to return when there is a zero division.

        Notes:

        - If set to "warn", this acts like 0, but a warning is also raised.
        - If set to `np.nan`, such values will be excluded from the average.

        .. versionadded:: 1.6

Member

I think the docstring, as it is, is vague and doesn't give enough information.

For instance, this warning is now removed: "y_pred contains classes not in y_true".

The diff here seems more complicated than the other instances of adding zero_division, and this parameter is doing more than simply replacing a NaN from a division by zero.

With the actual behavior better explained, it would be clearer how to proceed.

Member

We could improve the current warning to give more info; both warnings (old and new) are correct: when y_pred contains classes not in y_true, tp + fn = 0 (the sum of true labels) for that class; this results in a zero division, and balanced accuracy is ill-defined in this case.

How about we improve the warning to give more info, e.g., that the score is ill-defined because y_pred contains classes not in y_true (resulting in a zero division when calculating recall)?

I appreciate it is not similar to precision/recall/F1, where there is explicit averaging/an explicit average parameter, so it's not so obvious that averaging is happening. We do say above: "It is defined as the average of recall obtained on each class", but we could more explicitly explain in the zero_division parameter that the zero division occurs when calculating the recall for each class (see the sketch after this comment).

Similar to what we do in cohen_kappa_score:

    zero_division : {"warn", 0.0, 1.0, np.nan}, default="warn"
        Sets the return value when there is a zero division. This is the case when
        both labelings `y1` and `y2` exclusively contain the 0 class (e.g.
        `[0, 0, 0, 0]`) or when both are empty. If set to "warn", returns `0.0`,
        but a warning is also raised.

(though cohen_kappa_score is even more different and there is no averaging)
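A small sketch of the zero-division case described above (illustrative data, not part of the PR): when y_pred contains a class absent from y_true, that class's recall is 0/0 and takes the zero_division value.

    import numpy as np
    from sklearn.metrics import recall_score

    y_true = [0, 0, 1, 1]
    y_pred = [0, 0, 1, 2]  # for class 2, tp + fn == 0, so its recall is 0/0

    # per-class recalls; the ill-defined entry takes the zero_division value
    print(recall_score(y_true, y_pred, average=None, zero_division=np.nan))
    # [1.  0.5  nan]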

@lucyleeow (Member) left a comment

I think I agree with @adrinjalali, in that the warning and zero_division docstring could be improved upon, but otherwise this looks good to me! Thanks!

    assert balanced_accuracy == pytest.approx(expected_score)

    # check the consistency with the averaged recall score per-class
    with warnings.catch_warnings(record=True):

Member

Why are we using record=True here?

Member

Indeed, we don't need to, because we don't use the recorded output.

@glemaitre self-requested a review November 5, 2024 16:46
@glemaitre (Member)

I pushed a commit to improve the documentation and acknowledge the definition of the balanced accuracy as the average of recalls, to make explicit what we mean by "average".
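For reference, a minimal sketch of that definition using public APIs (the data is illustrative):

    import numpy as np
    from sklearn.metrics import balanced_accuracy_score, confusion_matrix

    y_true = [0, 0, 1, 1, 1]
    y_pred = [0, 1, 1, 1, 0]

    C = confusion_matrix(y_true, y_pred)
    per_class_recall = np.diag(C) / C.sum(axis=1)  # recall of each true class
    # balanced accuracy is the average of the per-class recalls
    assert np.isclose(per_class_recall.mean(), balanced_accuracy_score(y_true, y_pred))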

Comment on lines +2539 to +2540
    - If set to `np.nan`, such values will be excluded from the average when
      computing the balanced accuracy as the average of the recalls.

Member

I'm confused by the semantics of this now. To me, zero_division=np.nan would mean "if there's a zero division, give me np.nan", but this is kind of the opposite. zero_division="ignore" would more easily mean what we have written here.

Member

Basically, we are consistent with the other metrics and with the previous definition:

    recall_score(y_true, y_pred, average="macro", zero_division=np.nan)

I don't think that we should change this behaviour here.

What makes the semantics weird here is that the balanced accuracy has this average without any keyword. However, I would find it weird if the result did not match the definition of recall_score. So if we don't like these semantics, we should change all the zero_division parameters around.
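Concretely, a sketch of the recall_score semantics being matched (illustrative data): with zero_division=np.nan, the ill-defined per-class recall is excluded from the macro average rather than propagated as NaN.

    import numpy as np
    from sklearn.metrics import recall_score

    y_true = [0, 0, 1, 1]
    y_pred = [0, 0, 1, 2]  # the recall of class 2 is 0/0

    print(recall_score(y_true, y_pred, average="macro", zero_division=np.nan))
    # 0.75, i.e. mean(1.0, 0.5): the NaN is excluded, the result is not NaN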

Member

I really feel like we're continuing down a wrong path here.

Reading the code

    recall_score(y_true, y_pred, average="macro", zero_division=np.nan)

without reading the docstring, I really expect to get np.nan if there's any zero division in the calculations, not to have them ignored in an aggregate calculation.

I'd be more in favor of deprecating that usage of zero_division=np.nan in favor of zero_division="ignore" or similar.

This is the situation now:

accuracy_score, classification_report, cohen_kappa_score, jaccard_score, matthews_corrcoef are all variations of this text:

    zero_division : {"warn", 0.0, 1.0, np.nan}, default="warn"
        Sets the value to return when there is a zero division,
        e.g. when `y_true` and `y_pred` are empty.
        If set to "warn", returns 0.0, but a warning is also raised.

        .. versionadded:: 1.6

However, classification_report calls precision_recall_fscore_support, which among other metrics has:

f1_score, fbeta_score, precision_recall_fscore_support, precision_score, recall_score:

    zero_division : {"warn", 0.0, 1.0, np.nan}, default="warn"
        Sets the value to return when there is a zero division, i.e. when all
        predictions and labels are negative.

        Notes:
        - If set to "warn", this acts like 0, but a warning is also raised.
        - If set to `np.nan`, such values will be excluded from the average.

        .. versionadded:: 1.3
           `np.nan` option was added.

I do think we need to change the status quo for the second batch, and make the code written by the user intuitive rather than very surprising.

Member

I agree that there is something wrong here.

I don't like the semantics of zero_division="ignore" because the exclusion happens at the aggregation level.

However, I'm wondering if we should not only return an aggregation that takes the np.nan into account. Practically speaking, returning an average that does not take a certain class into account is actually dangerous. So I need to go back, because I don't remember why this behaviour was chosen.

Member

I am more and more convinced that we need both np.nan and "ignore": np.nan would be taken into account in the final aggregation if there is one, while "ignore" would mean that the result of the zero division is ignored at the aggregation stage. My question now is: if we have the "ignore" option, what is the behaviour when we don't aggregate? (A toy sketch of the two semantics follows below.)
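A toy numpy sketch of the two proposed semantics (hypothetical, not current scikit-learn behavior): np.nan would propagate through the aggregate, while "ignore" would drop the ill-defined entries before averaging.

    import numpy as np

    per_class = np.array([1.0, 0.5, np.nan])  # NaN marks the zero-division class

    # zero_division=np.nan (proposed): the NaN propagates into the aggregate
    print(per_class.mean())  # nan

    # zero_division="ignore" (proposed): ill-defined entries are dropped
    print(np.nanmean(per_class))  # 0.75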

Member

Basically, what do we expect from the following example?

    recall_score([[0, 0], [0, 0]], [[1, 1], [1, 1]], average=None, zero_division="ignore")

Member

Oh yeah, I wasn't saying we should remove np.nan from all of them. As in my comment here (#28038 (comment)), to me the semantics of the first batch of the metrics (except classification_report) are okay as is. We only need to change the second batch.

Member

"np.nan will be taken into account in the final aggregation if there is one"

Can you clarify what this means?

Member

Clarification here: #29048 (comment)

@lucyleeow (Member) left a comment

LGTM!

@glemaitre (Member)

Moving this to 1.7 since we will need to revisit it.

Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

Balanced Accuracy Score is NOT equal to Recall Score
4 participants