
roc_auc_score: incorrect result after merging #27412 #30079

@janezd

Description

Describe the bug

When all data instances come from the same class, #27412 changed the behaviour of roc_auc_score to return 0.0 instead of raising an exception. The argument for the change was consistency with PR curves. I believe this result is incorrect, or at least not correct under all interpretations. Even if only the latter holds, it is not worth breaking backwards compatibility for a change that is a matter of discussion, particularly when the change masks an error by returning a (dubious) "default".

Arguments

The issue arises when all data instances belong to the same class. While AUC is, literally, the area under the ROC curve, we interpret it as a score reflecting the quality of a ranking, which relates it to the Gini index and the Mann-Whitney U statistic, as also described in the scikit-learn documentation.

  • Under the geometric interpretation, if all data comes from the same class, the curve goes either straight right or straight up, depending on the class, so the area can be either 0 or 1 (or, arguably, 0.5), not necessarily 0.0.
  • Under the statistical interpretation, the AUC is undefined. AUC is the probability that, for a random pair of instances from different classes, the score assigned to the instance from the positive class is higher than the score assigned to the instance from the negative class. This quantity cannot be computed for data from a single class and is thus undefined. The function should return np.nan or raise an exception (as it used to).
  • Furthermore (and related to the previous point), for any y_true and y_score, it holds that
    auc(y_true, y_score) \
    == auc(1 - y_true, 1 - y_score) \
    == 1 - auc(y_true, 1 - y_score) \
    == 1 - auc(1 - y_true, y_score)

Flipping either the labels or the scores reverses the curve and the AUC, while flipping both keeps the AUC the same. Before #27412, roc_auc_score raised an exception when the result could not be computed. Now it returns 0.0, which is inconsistent under flipping of classes or scores (or both).
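The identities above are easy to verify on well-defined two-class data; a minimal sketch (the data values are illustrative):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# A well-defined two-class example (no ties between scores)
y_true = np.array([0, 0, 1, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7])

auc = roc_auc_score(y_true, y_score)

# Flipping both labels and scores leaves the AUC unchanged ...
assert np.isclose(auc, roc_auc_score(1 - y_true, 1 - y_score))
# ... while flipping only one of them reverses it.
assert np.isclose(auc, 1 - roc_auc_score(y_true, 1 - y_score))
assert np.isclose(auc, 1 - roc_auc_score(1 - y_true, y_score))
```

With single-class y_true, the current behaviour returns 0.0 for both y_score and 1 - y_score, so the third and fourth identities cannot both hold.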

Suggestion

I suggest reverting the change at https://github.com/scikit-learn/scikit-learn/pull/27412/files#diff-4eb3c023f8a3f088d62208f6adbd02b6df5196de2257ccd228dffc972c964634R375, that is, raising an exception instead of returning a number that is arbitrary under some interpretations. Alternatively, the function could return np.nan, but an explicit exception is preferable and, above all, preserves backward compatibility with behaviour that was not wrong.
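In the meantime, callers who want the old behaviour can guard against the degenerate input themselves; a minimal sketch (the helper name safe_roc_auc_score is hypothetical, not part of scikit-learn):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def safe_roc_auc_score(y_true, y_score):
    # Hypothetical wrapper restoring the pre-#27412 behaviour:
    # refuse to score inputs where only one class is present.
    if np.unique(y_true).size < 2:
        raise ValueError(
            "Only one class present in y_true. "
            "ROC AUC score is not defined in that case."
        )
    return roc_auc_score(y_true, y_score)
```

With two classes it behaves exactly like roc_auc_score; with a single class it raises, as the function did before #27412.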

Steps/Code to Reproduce

from sklearn.metrics import roc_auc_score
import numpy as np

# Every instance belongs to the positive class, so the AUC is undefined
y_true = np.array([1, 1, 1, 1, 1])
y_score = np.array([0.8, 0.6, 0.5, 0.3, 0.2])
print(roc_auc_score(y_true, y_score))

Expected Results

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/janez/miniforge3/envs/o3/lib/python3.11/site-packages/sklearn/utils/_param_validation.py", line 213, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/janez/miniforge3/envs/o3/lib/python3.11/site-packages/sklearn/metrics/_ranking.py", line 640, in roc_auc_score
    return _average_binary_score(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/janez/miniforge3/envs/o3/lib/python3.11/site-packages/sklearn/metrics/_base.py", line 76, in _average_binary_score
    return binary_metric(y_true, y_score, sample_weight=sample_weight)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/janez/miniforge3/envs/o3/lib/python3.11/site-packages/sklearn/metrics/_ranking.py", line 382, in _binary_roc_auc_score
    raise ValueError(
ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.

Actual Results

0.0

Versions

System:
    python: 3.11.10 | packaged by conda-forge | (main, Sep 10 2024, 10:57:35) [Clang 17.0.6 ]
executable: /Users/janez/miniforge3/envs/o3edge/bin/python
   machine: macOS-14.6.1-arm64-arm-64bit

Python dependencies:
      sklearn: 1.6.dev0
          pip: 24.2
   setuptools: 73.0.1
        numpy: 1.26.4
        scipy: 1.15.0.dev0
       Cython: 3.0.11
       pandas: 3.0.0.dev0+1524.g23c497bb2f
   matplotlib: 3.9.2
       joblib: 1.4.2
threadpoolctl: 3.5.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
    num_threads: 8
         prefix: libopenblas
       filepath: /Users/janez/miniforge3/envs/o3edge/lib/python3.11/site-packages/numpy/.dylibs/libopenblas64_.0.dylib
        version: 0.3.23.dev
threading_layer: pthreads
   architecture: armv8

       user_api: openmp
   internal_api: openmp
    num_threads: 8
         prefix: libomp
       filepath: /Users/janez/miniforge3/envs/o3edge/lib/python3.11/site-packages/sklearn/.dylibs/libomp.dylib
        version: None
