
Function to get scorers for task #12385

Open
@jnothman

Description

I would like to see a utility which would construct a set of applicable scorers for a particular task, returning a Mapping from string to callable scorer. It will be hard to design this API right the first time. [Maybe this should initially be developed outside this project and contributed to scikit-learn-contrib, but I think it reduces the risk of mis-specifying scorers, so it's of benefit to this project.]

The user will be able to select a subset of the scorers, either with a dict comprehension (see the usage sketch after the example below) or with some specialised methods or function parameters. Running all of these scorers would not be efficient at first, but hopefully we can do something to fix #10802 :|.

Let's take, for instance, a binary classification task. For binary y, the function get_applicable_scorers(y, pos_label='yes') might produce something like:

{
    'accuracy': make_scorer(accuracy_score),
    'balanced_accuracy': make_scorer(balanced_accuracy_score),
    'matthews_corrcoef': make_scorer(matthews_corrcoef),
    'cohens_kappa': make_scorer(cohen_kappa_score),
    'precision': make_scorer(precision_score, pos_label='yes'),
    'recall': make_scorer(recall_score, pos_label='yes'),
    'f1': make_scorer(f1_score, pos_label='yes'),
    'f0.5': make_scorer(fbeta_score, pos_label='yes', beta=0.5),
    'f2': make_scorer(fbeta_score, pos_label='yes', beta=2),
    'specificity': ...,
    'miss_rate': ...,
    ...
    'roc_auc': make_scorer(roc_auc_score, needs_threshold=True),
    'average_precision': make_scorer(average_precision_score, needs_threshold=True),
    'neg_log_loss': make_scorer(log_loss, needs_proba=True, greater_is_better=False),
    'neg_brier_score_loss': make_scorer(brier_score_loss, needs_proba=True, greater_is_better=False, pos_label='yes'),
}
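
To illustrate the subsetting mentioned above (a hedged sketch: get_applicable_scorers is the proposed function, not an existing API, and estimator, X and y stand in for the user's own objects), a dict comprehension over the returned mapping composes directly with cross_validate's existing support for dict scoring:

from sklearn.model_selection import cross_validate

scorers = get_applicable_scorers(y, pos_label='yes')
# keep only the threshold-free metrics we care about
selected = {name: scorer for name, scorer in scorers.items()
            if name in ('accuracy', 'balanced_accuracy', 'f1')}
results = cross_validate(estimator, X, y, scoring=selected)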

Doing the same for multiclass classification would pass labels as appropriate, and would optionally produce per-class binary metrics as well as overall multiclass metrics.
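
As a rough sketch of the per-class part (relying only on the existing labels/average parameters of the multiclass metrics; per_class_scorers is a hypothetical helper name):

import numpy as np
from sklearn.metrics import make_scorer, f1_score, recall_score

def per_class_scorers(y):
    # restricting to labels=[label] with average='macro' reduces the
    # multiclass metric to the binary score for that single class
    scorers = {}
    for label in np.unique(y):
        scorers['recall_%s' % label] = make_scorer(
            recall_score, labels=[label], average='macro')
        scorers['f1_%s' % label] = make_scorer(
            f1_score, labels=[label], average='macro')
    return scorers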

I'm not sure how sample_weight fits in here, but ha! we still don't support weighted scoring in cross validation (#1574), so let's not worry about that.
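
For concreteness, a minimal sketch of the top-level dispatch (type_of_target is the existing helper in sklearn.utils.multiclass; binary_scorers and multiclass_scorers are hypothetical helpers returning dicts like the one above):

from sklearn.metrics import make_scorer, mean_squared_error, r2_score
from sklearn.utils.multiclass import type_of_target

def get_applicable_scorers(y, **kwargs):
    # dispatch on the task implied by y and return {name: scorer}
    target_type = type_of_target(y)
    if target_type == 'binary':
        return binary_scorers(y, **kwargs)      # e.g. the dict above
    if target_type == 'multiclass':
        return multiclass_scorers(y, **kwargs)  # including per-class scorers
    if target_type == 'continuous':
        return {'r2': make_scorer(r2_score),
                'neg_mean_squared_error': make_scorer(
                    mean_squared_error, greater_is_better=False)}
    raise ValueError('no applicable scorers for target type %r' % target_type)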

Metadata

Labels: API, Moderate, New Feature
