ENH add `from_cv_results` in `PrecisionRecallDisplay` (single Display) #30508
Conversation
I forgot to mention: I think I would like to decide on the order of parameters for these display classes and their methods. They have a lot of overlap and it would be great if they were consistent.
I know that the order does not matter when calling the methods, but it would be nice for the API documentation page if they were consistent.
name_ = [name_] * n_multi if name_ is None else name_
average_precision_ = (
    [None] * n_multi if self.average_precision is None else self.average_precision
)
I don't like this, but I could not immediately think of a better way to do it.
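For reference, a minimal standalone sketch of what this broadcasting achieves (hypothetical names, not the actual Display code): scalar or `None` inputs are normalized to one entry per curve so the plotting loop can treat single- and multi-curve cases uniformly.

```python
# Minimal sketch of the broadcasting idea (hypothetical standalone example,
# not the actual PrecisionRecallDisplay code): normalize possibly-None values
# to one entry per curve so the plotting loop is uniform.
n_multi = 3

name = None
average_precision = None

name_ = [name] * n_multi if name is None else name
average_precision_ = (
    [None] * n_multi if average_precision is None else average_precision
)

print(name_)               # [None, None, None]
print(average_precision_)  # [None, None, None]
```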
precision_all.append(precision)
recall_all.append(recall)
ap_all.append(average_precision)
I don't like this either, but I'm not sure about the zip suggested in #30399 (comment), as you've got to unpack at the end 🤔
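For comparison, a rough sketch of the zip-based alternative mentioned above (dummy data, hypothetical variable names); the unpacking at the end is the step that feels awkward:

```python
import numpy as np

# Rough sketch of the zip-based alternative (dummy data, hypothetical names):
# collect per-fold (precision, recall, average_precision) tuples first...
per_fold = [
    (np.array([1.0, 0.5]), np.array([0.0, 1.0]), 0.75),
    (np.array([1.0, 0.6]), np.array([0.0, 1.0]), 0.80),
]

# ...then unpack into separate lists at the end (the awkward step).
precision_all, recall_all, ap_all = (list(t) for t in zip(*per_fold))
```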
Some notes on review suggestions: namely, making all the multi-class params (`precisions`, `recalls`, etc.) lists of ndarrays.
Also realised we did not need a separate `plot_single_curve` function, as most of the complexity was in `_get_line_kwargs`.
names : str, default=None
    Names of each precision-recall curve for labeling. If `None`, use
    name provided at `PrecisionRecallDisplay` initialization. If not
    provided at initialization, no labeling is shown.
It seems reasonable that if we change the `name` parameter in the class init, we should change it here too, especially as we don't advocate that people use `plot` directly.
Discussed this with @glemaitre and decided that it is okay to change to `names`. We should however make it clear what this is setting: the label of the curve in the legend.
The problem use case we thought about was someone creating a plot and display object, then wanting to add one curve to it using `plot`; `names` would not make sense in that case. However, it would be difficult for us to manage the legend there, so we decided that it would be up to the user to manage the legend themselves.
Just wanted to document here that we discussed a potential enhancement for comparing between estimators, where you have cv results from several estimators (so several fold curves for each estimator). Potentially this could be added as a separate function, where you pass the display object and the desired estimators. Not planned, just a potential addition in future.
Hey, I think that you can revive this PR now that the ROC curve one is merged. Let's try to reuse code from the other PR if possible :)
Thanks @jeremiedbb! I think @glemaitre mentioned there was some discussion about what to do with the 'chance' level (average precision). In the current PR I have calculated a single average precision (AP) over all the data. Others suggested that we should calculate the average precision for each fold, which I can see is more accurate, but I am concerned about how the visualization would look. Here I have used 5 cv splits, plotted the chance level for each, and coloured each precision-recall curve / chance line pair the same colour. I have some concerns about the resulting visualization.
I will have more of a think about a better solution for this.
So for the "Chance level" I would consider giving all lines the same color (and a lower alpha) and having a single legend entry showing the mean + std. I think that would be enough. Also, it is easy to link a chance-level line with its PR curve because they meet when the recall is 1.0.
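A rough matplotlib-only sketch of that suggestion (dummy per-fold AP values, not the Display API): every per-fold chance line shares one colour and a low alpha, and only the first line is labelled so the legend gets a single entry reporting mean ± std.

```python
import matplotlib.pyplot as plt
import numpy as np

# Dummy per-fold average precisions standing in for real cv results.
ap_per_fold = np.array([0.62, 0.58, 0.65, 0.60, 0.59])

fig, ax = plt.subplots()
for i, ap in enumerate(ap_per_fold):
    # Label only the first line so the legend has a single "Chance level" entry.
    label = (
        f"Chance level (AP = {ap_per_fold.mean():.2f} ± {ap_per_fold.std():.2f})"
        if i == 0
        else None
    )
    # Horizontal line at the fold's AP, same colour and lower alpha for all folds.
    ax.plot([0, 1], [ap, ap], color="k", alpha=0.3, linestyle="--", label=label)

ax.set_xlabel("Recall")
ax.set_ylabel("Precision")
ax.legend()
plt.show()
```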
def test_precision_recall_display_string_labels(pyplot):
When `y` is composed of string labels:

- `from_predictions` raises an error if `pos_label` is not explicitly passed (via `_check_pos_label_consistency`). This makes sense, as we cannot guess what `pos_label` should be.
- `from_estimator` does not raise an error because we default to `estimator.classes_[1]` (`_get_response_values_binary` does this).
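A quick standalone reproduction of the behaviour described above (assuming current scikit-learn behaviour; dataset and estimator choices are arbitrary):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import PrecisionRecallDisplay

X, y_num = make_classification(random_state=0)
y = np.array(["neg", "pos"])[y_num]  # binary target with string labels

clf = LogisticRegression().fit(X, y)

# from_estimator works: pos_label defaults to clf.classes_[1] ("pos").
PrecisionRecallDisplay.from_estimator(clf, X, y)

# from_predictions raises without an explicit pos_label for string labels.
try:
    PrecisionRecallDisplay.from_predictions(y, clf.predict_proba(X)[:, 1])
except ValueError as exc:
    print(exc)
```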
I think it is reasonable for `from_cv_results` to also default to `estimator.classes_[-1]` (this is indeed what we have in the docstring, but it is NOT what we are doing in `main`). This case is a bit more complicated than `from_estimator` because it is possible that not every class is present in each split (see #29558), so we could end up with different `pos_label` values. Still thinking through this, but I think I would be happy to check that if `pos_label` is not explicitly passed, it has been inferred to be the same for every split. WDYT @glemaitre?
Edit: Actually, I think all estimators would raise an error if there are fewer than 2 classes, so we can just leave it to the estimator.
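For illustration, a hedged sketch of the per-split consistency check being considered (hypothetical helper and variable names, not the actual PR code): infer `pos_label` for each split and error out if the inferred values disagree.

```python
# Hypothetical sketch of checking that the inferred pos_label agrees across
# CV splits when the user did not pass one explicitly (not the actual PR code).
def check_inferred_pos_label_consistent(inferred_pos_labels, pos_label=None):
    if pos_label is not None:
        return pos_label  # user was explicit, nothing to check
    unique = set(inferred_pos_labels)
    if len(unique) > 1:
        raise ValueError(
            "pos_label could not be inferred consistently across CV splits: "
            f"got {sorted(unique, key=str)}. Pass pos_label explicitly."
        )
    return unique.pop()


# Usage: one inferred label per split, e.g. from _get_response_values_binary.
print(check_inferred_pos_label_consistent(["yes", "yes", "yes"]))  # "yes"
```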
    cv_results["estimator"], cv_results["indices"]["test"]
):
    y_true = _safe_indexing(y, test_indices)
    y_pred, pos_label_ = _get_response_values_binary(
Not sure if there needs to be any note/comment here. `_get_response_values_binary` infers `pos_label_` to be `classes[-1]` when `pos_label=None`, which is fine unless:

- the data is binary but one class is missing in a cv fold - in this case the estimator will raise an error that it was fit on data that only contains one class
- the data is multi-class - `precision_recall_curve` will complain
I think that it is fine; we will get an expected error.
@jeremiedbb would you be interested in taking a look at this? Guillaume is probably a bit busy at the moment. Thanks
Just pushed a small commit to resolve the conflicts.
I will now check the tests and the coverage issue in more detail.
_validate_style_kwargs({"label": label}, curve_kwargs[fold_idx])
_validate_style_kwargs(
    {"label": label, **default_curve_kwargs_}, curve_kwargs[fold_idx]
)
I think that it makes more sense. We could argue that it is a fix, because previously we were overwriting the full dict instead of a key-value inside it.
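A tiny standalone illustration of the precedence this gives (plain dict merging, not the actual `_validate_style_kwargs` internals; names hypothetical): user-provided kwargs override the defaults key by key rather than replacing them wholesale.

```python
# Standalone illustration of the intended precedence (plain dicts only,
# not the actual _validate_style_kwargs internals; names hypothetical).
label = "Fold 0 (AP = 0.81)"
default_curve_kwargs_ = {"alpha": 0.5, "linestyle": "--", "color": "blue"}
user_curve_kwargs = {"color": "red"}

# Defaults go into the base dict, then user kwargs override per key.
merged = {"label": label, **default_curve_kwargs_, **user_curve_kwargs}
print(merged)
# {'label': 'Fold 0 (AP = 0.81)', 'alpha': 0.5, 'linestyle': '--', 'color': 'red'}
```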
Regarding the coverage, it seems that one of the
I looked at the tests and they look good. I will now have a look at the example to see where we can introduce this feature.
I think that we could introduce the
Thanks for the review @glemaitre. I've added a small section to introduce
Ping @jeremiedbb maybe you could take a look as Guillaume is busy. Thank you!
Reference Issues/PRs
Follows on from #30399
What does this implement/fix? Explain your changes.
Proof of concept of adding multi displays to `PrecisionRecallDisplay`:

- Overlaps with `from_cv_results` in `RocCurveDisplay` (single `RocCurveDisplay`) #30399, so we can definitely factorize out, though small intricacies may make it complex.
- The `plot` method is complex due to handling both single and multi curves and doing a lot more checking, as the user is able to use it outside of the `from_estimator` and `from_predictions` methods.

Detailed discussions of problems are in the review comments.
Any other comments?
cc @glemaitre @jeremiedbb