ENH add `from_cv_results` in `PrecisionRecallDisplay` (single Display) #30508
Conversation
I forgot to mention: I think I would like to decide on the order of parameters for these display classes and their methods. They have a lot of overlap and it would be great if they were consistent.
I know that the order does not matter when calling the methods, but it would be nice for the API documentation page if they were consistent.
name_ = [name_] * n_multi if name_ is None else name_
average_precision_ = (
    [None] * n_multi if self.average_precision is None else self.average_precision
)
I don't like this, but I could not immediately think of a better way to do it.
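For reference, a minimal standalone sketch of what this broadcasting achieves (hypothetical names, not the actual Display code): scalar or `None` inputs are normalized to one entry per curve so the plotting loop can treat single- and multi-curve cases uniformly.

```python
# Minimal sketch of the broadcasting idea (hypothetical standalone example,
# not the actual PrecisionRecallDisplay code): normalize possibly-None values
# to one entry per curve so the plotting loop is uniform.
n_multi = 3

name = None
average_precision = None

name_ = [name] * n_multi if name is None else name
average_precision_ = (
    [None] * n_multi if average_precision is None else average_precision
)

print(name_)               # [None, None, None]
print(average_precision_)  # [None, None, None]
```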
precision_all.append(precision)
recall_all.append(recall)
ap_all.append(average_precision)
I don't like this either, but I'm not sure about the zip suggested in #30399 (comment), as you've got to unpack at the end 🤔
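For comparison, a rough sketch of the zip-based alternative mentioned above (dummy data, hypothetical variable names); the unpacking at the end is the step that feels awkward:

```python
import numpy as np

# Rough sketch of the zip-based alternative (dummy data, hypothetical names):
# collect per-fold (precision, recall, average_precision) tuples first...
per_fold = [
    (np.array([1.0, 0.5]), np.array([0.0, 1.0]), 0.75),
    (np.array([1.0, 0.6]), np.array([0.0, 1.0]), 0.80),
]

# ...then unpack into separate lists at the end (the awkward step).
precision_all, recall_all, ap_all = (list(t) for t in zip(*per_fold))
```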
Some notes on review suggestions: namely, making all the multi-class params (`precisions`, `recalls`, etc.) lists of ndarrays.
Also realised we did not need a separate `plot_single_curve` function, as most of the complexity was in `_get_line_kwargs`.
names : str, default=None
    Names of each precision-recall curve for labeling. If `None`, use
    name provided at `PrecisionRecallDisplay` initialization. If not
    provided at initialization, no labeling is shown.
It seems reasonable that if we change the `name` parameter in the class init, we should change it here too, especially as we don't advocate that people use `plot` directly.
Discussed this with @glemaitre and decided that it is okay to change to `names`. We should however make it clear what this is setting: the label of the curve in the legend.
The problem use case we thought about was someone creating a plot and display object, then wanting to add one curve to it using `plot`; `names` would not make sense in that case. However, it would be difficult for us to manage the legend there, so we decided that it would be up to the user to manage the legend themselves.
Just wanted to document here that we discussed a potential enhancement for comparing between estimators, where you have cv results from several estimators (so several fold curves for each estimator). Potentially this could be added as a separate function, where you pass the display object and the desired estimators. Not planned, just a potential addition in future.
Hey, I think that you can revive this PR now that the ROC curve one is merged. Let's try to reuse code from the other PR if possible :)
Thanks @jeremiedbb! I think @glemaitre mentioned there was some discussion about what to do with the 'chance' level (average precision). In the current PR I have calculated a single average precision (AP) over all the data. Others suggested that we should calculate the average precision for each fold, which I can see is more accurate, but I am concerned about how the visualization would look. Here I have used 5 cv splits, plotted the chance level for each, and coloured each precision-recall curve / chance line pair the same colour. I have some concerns about the resulting visualization.
I will have more of a think about a better solution for this.
So for the "Chance level" I would consider giving all lines the same color (and a lower alpha) and having a single legend entry showing the mean + std. I think that would be enough. Also, it is easy to link a chance-level line with its PR curve because they meet when the recall is 1.0.
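A rough matplotlib-only sketch of that suggestion (dummy per-fold AP values, not the Display API): every per-fold chance line shares one colour and a low alpha, and only the first line is labelled so the legend gets a single entry reporting mean ± std.

```python
import matplotlib.pyplot as plt
import numpy as np

# Dummy per-fold average precisions standing in for real cv results.
ap_per_fold = np.array([0.62, 0.58, 0.65, 0.60, 0.59])

fig, ax = plt.subplots()
for i, ap in enumerate(ap_per_fold):
    # Label only the first line so the legend has a single "Chance level" entry.
    label = (
        f"Chance level (AP = {ap_per_fold.mean():.2f} ± {ap_per_fold.std():.2f})"
        if i == 0
        else None
    )
    # Horizontal line at the fold's AP, same colour and lower alpha for all folds.
    ax.plot([0, 1], [ap, ap], color="k", alpha=0.3, linestyle="--", label=label)

ax.set_xlabel("Recall")
ax.set_ylabel("Precision")
ax.legend()
plt.show()
```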
def test_precision_recall_display_string_labels(pyplot):
When `y` is composed of string labels:

- `from_predictions` raises an error if `pos_label` is not explicitly passed (via `_check_pos_label_consistency`). This makes sense, as we cannot guess what `pos_label` should be.
- `from_estimator` does not raise an error because we default to `estimator.classes_[1]` (`_get_response_values_binary` does this).
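A quick standalone reproduction of the behaviour described above (assuming current scikit-learn behaviour; dataset and estimator choices are arbitrary):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import PrecisionRecallDisplay

X, y_num = make_classification(random_state=0)
y = np.array(["neg", "pos"])[y_num]  # binary target with string labels

clf = LogisticRegression().fit(X, y)

# from_estimator works: pos_label defaults to clf.classes_[1] ("pos").
PrecisionRecallDisplay.from_estimator(clf, X, y)

# from_predictions raises without an explicit pos_label for string labels.
try:
    PrecisionRecallDisplay.from_predictions(y, clf.predict_proba(X)[:, 1])
except ValueError as exc:
    print(exc)
```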
I think it is reasonable for `from_cv_results` to also default to `estimator.classes_[-1]` (this is indeed what we have in the docstring, but it is NOT what we are doing in `main`). This case is a bit more complicated than `from_estimator` because it is possible that not every class is present in each split (see #29558), so we could end up with different `pos_label` values. Still thinking through this, but I think I would be happy to check that if `pos_label` is not explicitly passed, it has been inferred to be the same for every split. WDYT @glemaitre?
Edit: Actually, I think all estimators would raise an error if there are fewer than 2 classes, so we can just leave it to the estimator.
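For illustration, a hedged sketch of the per-split consistency check being considered (hypothetical helper and variable names, not the actual PR code): infer `pos_label` for each split and error out if the inferred values disagree.

```python
# Hypothetical sketch of checking that the inferred pos_label agrees across
# CV splits when the user did not pass one explicitly (not the actual PR code).
def check_inferred_pos_label_consistent(inferred_pos_labels, pos_label=None):
    if pos_label is not None:
        return pos_label  # user was explicit, nothing to check
    unique = set(inferred_pos_labels)
    if len(unique) > 1:
        raise ValueError(
            "pos_label could not be inferred consistently across CV splits: "
            f"got {sorted(unique, key=str)}. Pass pos_label explicitly."
        )
    return unique.pop()


# Usage: one inferred label per split, e.g. from _get_response_values_binary.
print(check_inferred_pos_label_consistent(["yes", "yes", "yes"]))  # "yes"
```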
    cv_results["estimator"], cv_results["indices"]["test"]
):
    y_true = _safe_indexing(y, test_indices)
    y_pred, pos_label_ = _get_response_values_binary(
Not sure if there needs to be any note/comment here. `_get_response_values_binary` infers `pos_label_` to be `classes[-1]` when `pos_label=None`, which is fine unless:

- the data is binary but one class is missing in a cv fold - in this case the estimator will raise an error that it was fit on data that only contains one class
- the data is multi-class - `precision_recall_curve` will complain
I think that it is fine; we will get an expected error.
@jeremiedbb would you be interested in taking a look at this? Guillaume is probably a bit busy at the moment. Thanks
Just pushed a small commit to resolve the conflicts.
I will now check the tests and the coverage issue in more detail.
_validate_style_kwargs({"label": label}, curve_kwargs[fold_idx])
_validate_style_kwargs(
    {"label": label, **default_curve_kwargs_}, curve_kwargs[fold_idx]
)
I think that it makes more sense. We could argue that it is a fix, because previously we were overwriting the full dict instead of a key-value inside it.
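A tiny standalone illustration of the precedence this gives (plain dict merging, not the actual `_validate_style_kwargs` internals; names hypothetical): user-provided kwargs override the defaults key by key rather than replacing them wholesale.

```python
# Standalone illustration of the intended precedence (plain dicts only,
# not the actual _validate_style_kwargs internals; names hypothetical).
label = "Fold 0 (AP = 0.81)"
default_curve_kwargs_ = {"alpha": 0.5, "linestyle": "--", "color": "blue"}
user_curve_kwargs = {"color": "red"}

# Defaults go into the base dict, then user kwargs override per key.
merged = {"label": label, **default_curve_kwargs_, **user_curve_kwargs}
print(merged)
# {'label': 'Fold 0 (AP = 0.81)', 'alpha': 0.5, 'linestyle': '--', 'color': 'red'}
```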
Regarding the coverage, it seems that one of the
I looked at the tests and they look good. I will now have a look at the example to see where we can introduce this feature.
I think that we could introduce the
Thanks for the review @glemaitre. I've added a small section to introduce
Ping @jeremiedbb maybe you could take a look as Guillaume is busy. Thank you!
Reference Issues/PRs
Follows on from #30399
What does this implement/fix? Explain your changes.
Proof of concept of adding multi displays to `PrecisionRecallDisplay`:

- Overlaps with `from_cv_results` in `RocCurveDisplay` (single `RocCurveDisplay`) #30399, so we can definitely factorize out, though small intricacies may make it complex.
- The `plot` method is complex due to handling both single and multi curves and doing a lot more checking, as the user is able to use it outside of the `from_estimator` and `from_predictions` methods.

Detailed discussions of problems are in the review comments.
Any other comments?
cc @glemaitre @jeremiedbb