
DOC Improve pos_label and labels in precision/recall/f1 and jaccard #27714


Merged
merged 5 commits on Nov 4, 2023

Conversation

@lucyleeow (Member) commented Nov 3, 2023

Reference Issues/PRs

Towards #10010 - removal of pos_label=None as it is ill-defined and inconsistently implemented.

What does this implement/fix? Explain your changes.

Documentation of pos_label=None in precision_recall_fscore_support seems to have been removed in this commit. Prior to v0.18 you needed to set pos_label=None if targets were binary but you wanted to use average != 'binary' (ref). Now pos_label is just ignored if average != 'binary', so you no longer need to worry about setting pos_label to any specific value. (The related functions f1_score / fbeta_score / precision_score / recall_score behave the same.)
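
For illustration, a minimal sketch of that behaviour on toy data (not part of the PR's changes; it just exercises the public f1_score API, and the same holds for the related functions):

```python
from sklearn.metrics import f1_score

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

# average='binary': pos_label selects which of the two classes is scored,
# so these two calls give different results.
print(f1_score(y_true, y_pred, average="binary", pos_label=1))
print(f1_score(y_true, y_pred, average="binary", pos_label=0))

# average='macro': pos_label is ignored (scikit-learn may warn about this),
# so both calls return the same averaged score.
print(f1_score(y_true, y_pred, average="macro", pos_label=1))
print(f1_score(y_true, y_pred, average="macro", pos_label=0))
```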

The history of the similar jaccard_score was more difficult to track. pos_label was added in #13151, but there was never any mention of a None option.

Implementation-wise, all these functions allow pos_label=None. pos_label is only used if average='binary', and in this case pos_label=None will raise an error.
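
A minimal sketch of that behaviour on toy data (not from the PR; precision_score is used here as a representative of the functions listed above):

```python
from sklearn.metrics import precision_score

y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]

# pos_label=None is accepted here because pos_label is unused when average != 'binary'.
print(precision_score(y_true, y_pred, average="macro", pos_label=None))

# With average='binary', pos_label is actually used, so None is rejected.
try:
    precision_score(y_true, y_pred, average="binary", pos_label=None)
except ValueError as exc:
    print(f"raised: {exc}")
```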

This PR:

  • Removes None as a pos_label option in the f1_score docstring (the only function that still had None in its docstring)
  • Removes the following mention of pos_label=None from the precision_recall_fscore_support docstring:

If ``pos_label is None`` and in binary classification, this function
returns the average precision, recall and F-measure if ``average``
is one of ``'micro'``, ``'macro'``, ``'weighted'`` or ``'samples'``.

and adds clarifications about the pos_label and labels params.

  • Adds the above pos_label and labels explanation to the other 5 functions as well
  • Clarifies the labels param doc regarding including/excluding labels:

order if ``average is None``. Labels present in the data can be
excluded, for example to calculate a multiclass average ignoring a
majority negative class, while labels not present in the data will
result in 0 components in a macro average. For multilabel targets,

Looking at the commit where this was added, the related addition to model_evaluation.rst seems to expand on what the docstring above means, so I've used this to clarify.
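
To make the labels wording quoted above concrete, a minimal sketch on toy data (not part of the PR) where a majority "negative" class 0 is excluded from a multiclass macro average:

```python
from sklearn.metrics import recall_score

# Class 0 is the majority negative class; classes 1 and 2 are the ones of interest.
y_true = [0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 1, 1, 0, 2, 2]

# Macro average over all labels present in the data.
print(recall_score(y_true, y_pred, average="macro"))

# Macro average over classes 1 and 2 only, ignoring the majority class 0.
print(recall_score(y_true, y_pred, labels=[1, 2], average="macro"))
```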

Any other comments?

tl;dr update pos_label and labels explanations in docstrings of the following 6 functions:

  • precision_recall_fscore_support / f1_score / fbeta_score / precision_score / recall_score
  • jaccard_score

@lucyleeow (Member, Author) commented:

cc @glemaitre

github-actions bot commented Nov 3, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: b8f10b2. Link to the linter CI: here

@glemaitre self-requested a review on November 3, 2023, 15:08
Jaccard similarity coefficient for `pos_label`. If `average` is not `'binary'`,
`pos_label` is ignored and scores for both classes are calculated, and averaged or
both returned (when `average=None`). Similarly, for :term:`multiclass` and
:term:`multilabel` targets, scores for all `labels` can be returned or use
glemaitre (Member) commented:

It would be nice to remove "can be"

lucyleeow (Member, Author) replied:

I've amended to not include "can be", hopefully it is better.
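
To make the jaccard_score wording quoted above concrete, a minimal sketch on toy data (not part of the PR's changes; it just exercises the public jaccard_score API):

```python
from sklearn.metrics import jaccard_score

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

# average='binary': a single score, for pos_label only.
print(jaccard_score(y_true, y_pred, average="binary", pos_label=1))

# average=None: pos_label is ignored and one score per class is returned.
print(jaccard_score(y_true, y_pred, average=None))
```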

and F1 score for both classes are calculated and averaged or both returned (when
`average=None`). Similarly, for :term:`multiclass` and :term:`multilabel` targets,
F1 score for all `labels` can be returned or use `average` to specify the averaging
technique to be used. Use `labels` specify the labels to calculate F1 score for.
glemaitre (Member) commented:

I am always wondering if it is more natural to say "compute" or "calculate" :) (you can change depending on what you like better, here and in other places)

label. For the :term:`binary` case, setting `average='binary'` will return
metrics for `pos_label`. If `average` is not `'binary'`, `pos_label` is ignored
and metrics for both classes are calculated, and averaged or both returned (when
`average=None`).Similarly, for :term:`multiclass` and :term:`multilabel` targets,
glemaitre (Member) suggested a change:

-`average=None`).Similarly, for :term:`multiclass` and :term:`multilabel` targets,
+`average=None`). Similarly, for :term:`multiclass` and :term:`multilabel` targets,

@glemaitre (Member) left a comment:

It looks much better @lucyleeow.

@lucyleeow added the Waiting for Second Reviewer (First reviewer is done, need a second one!) label on Nov 4, 2023
@glemaitre (Member) commented:

Going to merge this one since this is a DOC PR.
