DOC Improve `pos_label` and `labels` in precision/recall/f1 and jaccard (#27714)
Conversation
cc @glemaitre
sklearn/metrics/_classification.py
Outdated
Jaccard similarity coefficient for `pos_label`. If `average` is not `'binary'`,
`pos_label` is ignored and scores for both classes are calculated, and averaged or
both returned (when `average=None`). Similarly, for :term:`multiclass` and
:term:`multilabel` targets, scores for all `labels` can be returned or use
It would be nice to remove "can be"
I've amended to not include "can be", hopefully it is better.
sklearn/metrics/_classification.py
Outdated
and F1 score for both classes are calculated and averaged or both returned (when
`average=None`). Similarly, for :term:`multiclass` and :term:`multilabel` targets,
F1 score for all `labels` can be returned or use `average` to specify the averaging
technique to be used. Use `labels` specify the labels to calculate F1 score for.
I am always wondering if it is more natural to say "compute" or "calculate" :) (you can change depending what you like better here and in other places)
sklearn/metrics/_classification.py
Outdated
label. For the :term:`binary` case, setting `average='binary'` will return
metrics for `pos_label`. If `average` is not `'binary'`, `pos_label` is ignored
and metrics for both classes are calculated, and averaged or both returned (when
`average=None`).Similarly, for :term:`multiclass` and :term:`multilabel` targets,
Suggested change:
- `average=None`).Similarly, for :term:`multiclass` and :term:`multilabel` targets,
+ `average=None`). Similarly, for :term:`multiclass` and :term:`multilabel` targets,
It looks much better @lucyleeow.
Going to merge this one since this is a DOC PR.
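The docstring semantics discussed above (with `average='binary'` only `pos_label`'s class is scored; otherwise `pos_label` is ignored and every class is scored, then averaged or all returned when `average=None`) can be sketched in plain Python. This is a minimal illustration of the docstring wording with hypothetical helper names, not scikit-learn's actual implementation:

```python
# Toy sketch of the documented pos_label / average semantics for a binary
# precision score. Illustrative only; not scikit-learn's implementation.

def precision_for_class(y_true, y_pred, label):
    """Precision for the single class given by `label`."""
    tp = sum(t == p == label for t, p in zip(y_true, y_pred))
    predicted = sum(p == label for p in y_pred)
    return tp / predicted if predicted else 0.0

def precision(y_true, y_pred, pos_label=1, average="binary"):
    classes = sorted(set(y_true) | set(y_pred))
    if average == "binary":
        # Only here does `pos_label` matter.
        return precision_for_class(y_true, y_pred, pos_label)
    # For any other `average`, `pos_label` is ignored and the score is
    # computed for every class.
    scores = [precision_for_class(y_true, y_pred, c) for c in classes]
    if average is None:
        return scores          # one score per class
    if average == "macro":
        return sum(scores) / len(scores)
    raise ValueError(f"unsupported average: {average!r}")

y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
print(precision(y_true, y_pred, pos_label=1))      # 1.0
print(precision(y_true, y_pred, average=None))     # [0.666..., 1.0]
print(precision(y_true, y_pred, average="macro"))  # 0.833...
```

The same pattern applies to recall, F-beta, and the Jaccard coefficient: the metric is defined per class, and `average` decides whether one class (`'binary'`), all classes individually (`None`), or an aggregate is returned.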
Reference Issues/PRs

Towards #10010 - removal of `pos_label=None` as it is ill-defined and inconsistently implemented.

What does this implement/fix? Explain your changes.
Documentation of `pos_label=None` in `precision_recall_fscore_support` seems to have been removed in this commit. Prior to v0.18 you needed to set `pos_label=None` if targets were binary but you wanted to use `average != 'binary'` (ref). Now `pos_label` is just ignored if `average != 'binary'`, so you no longer need to worry about setting `pos_label` to any specific value. (The related functions `f1_score` / `fbeta_score` / `precision_score` / `recall_score` behave the same.)

History of the similar `jaccard_score` was more difficult to track. `pos_label` was added in #13151, but there was never any mention of a `None` option.

Implementation-wise, all these functions allow `pos_label=None`. `pos_label` is only used if `average='binary'`, and in this case `pos_label=None` will raise an error.

This PR:

- Removes `None` as a `pos_label` option in the `f1_score` docstring (the only function that still had `None` in its docstring).
- Removes mention of `pos_label=None` in the `precision_recall_fscore_support` docstring:
  scikit-learn/sklearn/metrics/_classification.py, lines 1580 to 1582 in fb6b9f5
  and adds clarifications about the `pos_label` and `labels` params.
- Adds the `pos_label` and `labels` explanation to the other 5 functions as well.
- Clarifies the `labels` param doc:
  scikit-learn/sklearn/metrics/_classification.py, lines 1609 to 1612 in 2fcf181
  Looking at the commit this was added in, the related addition to `model_evaluation.rst` seems to expand on what the docstring above means, so I've used this to clarify.

Any other comments?
tl;dr: update the `pos_label` and `labels` explanations in the docstrings of the following 6 functions: `precision_recall_fscore_support` / `f1_score` / `fbeta_score` / `precision_score` / `recall_score` / `jaccard_score`.
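For illustration, the `labels` interaction described in the touched docstrings (use `labels` to restrict which classes are scored and averaged) can be sketched as follows. This is a toy pure-Python version with hypothetical names, not the scikit-learn implementation:

```python
# Toy sketch of how `labels` restricts which classes enter the average,
# using the Jaccard coefficient per class. Illustrative only; not
# scikit-learn's implementation.

def jaccard_for_class(y_true, y_pred, label):
    """|A & B| / |A | B| over the indicator sets of samples with `label`."""
    true_set = {i for i, t in enumerate(y_true) if t == label}
    pred_set = {i for i, p in enumerate(y_pred) if p == label}
    union = true_set | pred_set
    return len(true_set & pred_set) / len(union) if union else 0.0

def jaccard(y_true, y_pred, labels=None, average="macro"):
    if labels is None:
        # Default: score every class present in either y_true or y_pred.
        labels = sorted(set(y_true) | set(y_pred))
    scores = [jaccard_for_class(y_true, y_pred, c) for c in labels]
    if average is None:
        return scores          # one score per requested label
    return sum(scores) / len(scores)

y_true = [0, 1, 2, 2, 1]
y_pred = [0, 2, 2, 2, 1]
print(jaccard(y_true, y_pred, average=None))   # per-class scores for 0, 1, 2
print(jaccard(y_true, y_pred, labels=[1, 2]))  # average over classes 1 and 2 only
```

Passing `labels=[some_class]` thus averages over a single class, which is why the real functions warn that `labels=[pos_label]` can be used to score one class when `average != 'binary'`.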