roc_curve doesn't always arbitrarily set thresholds[0] to max(y_score) + 1 #9790

Closed
@alexryndin

Description

The roc_curve docstring says the following:

thresholds : array, shape = [n_thresholds]
    Decreasing thresholds on the decision function used to compute
    fpr and tpr. *`thresholds[0]` represents no instances being predicted*
    *and is arbitrarily set to `max(y_score) + 1`.*

But as the reproduction below shows, this is not always the case. When the highest-scored sample belongs to the positive class (i.e. no negative-class sample gets a score above every positive-class sample), thresholds[0] is not set to max(y_score) + 1, and consequently the ROC curve does not start at the point (0, 0).
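
For context, here is a minimal, self-contained sketch of the logic I believe is responsible (simplified from my reading of the 0.18 source: distinct scores, no sample weights, no drop_intermediate; the function name is made up). The extra threshold only gets prepended when the top-ranked sample is a false positive, which matches the behaviour reported here:

import numpy as np

def sketch_threshold_padding(y_true, y_score):
    # Illustrative sketch, not the actual scikit-learn source.
    order = np.argsort(y_score)[::-1]        # sort samples by decreasing score
    y_sorted = np.asarray(y_true)[order]
    fps = np.cumsum(y_sorted == 0)           # false positives at each cut-off
    tps = np.cumsum(y_sorted == 1)           # true positives at each cut-off
    thresholds = np.asarray(y_score)[order]
    # The max(y_score) + 1 point is only prepended when the top-scored sample
    # is a negative (fps[0] != 0); otherwise thresholds[0] stays at max(y_score).
    if tps.size == 0 or fps[0] != 0:
        thresholds = np.r_[thresholds[0] + 1, thresholds]
    return thresholds

actual = [0., 0., 1., 1., 1.]
print(sketch_threshold_padding(actual, [0.89, 0.61, 0.83, 0.62, 0.90]))  # no padding point
print(sketch_threshold_padding(actual, [0.91, 0.61, 0.83, 0.62, 0.90]))  # padding point added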

Steps/Code to Reproduce

import numpy as np
from sklearn.metrics import roc_curve

actual = np.array([0., 0., 1., 1., 1.])
predicted = np.array([0.89, 0.61, 0.83, 0.62, 0.90])  # Just change predicted[0] to 0.91 and you'll get the right output.
_, _, thresholds = roc_curve(actual, predicted, drop_intermediate=False)
thresholds

Expected Results

array([ 1.89,  0.89,  0.9 ,  0.83,  0.62,  0.61]) # Note length of array

Actual Results

array([ 0.9 ,  0.89,  0.83,  0.62,  0.61]) # Note length of array
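
The missing point is easiest to see on the fpr/tpr arrays themselves: with the scores above, the curve starts at (0.0, ~0.33) instead of (0, 0). Output shown as I would expect it on 0.18.1:

import numpy as np
from sklearn.metrics import roc_curve

actual = np.array([0., 0., 1., 1., 1.])
predicted = np.array([0.89, 0.61, 0.83, 0.62, 0.90])
fpr, tpr, _ = roc_curve(actual, predicted, drop_intermediate=False)
# First point of the curve: (0.0, 0.333...) rather than (0.0, 0.0),
# so the plotted ROC curve is missing its origin.
print(fpr[0], tpr[0])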

Versions

Linux-4.10.0-33-generic-x86_64-with-debian-stretch-sid
Python 3.6.1 |Anaconda custom (64-bit)| (default, May 11 2017, 13:09:58) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
NumPy 1.12.1
SciPy 0.19.0
Scikit-Learn 0.18.1

Labels: Bug, Easy (Well-defined and straightforward way to resolve), help wanted
