Description
The roc_curve docstring says the following:
thresholds : array, shape = [n_thresholds] Decreasing thresholds on the decision function used to compute fpr and tpr. *`thresholds[0]` represents no instances being predicted* *and is arbitrarily set to `max(y_score) + 1`.*
But as the example below shows, this is not always the case. Unless some negative-class sample receives a score higher than every positive-class sample, thresholds[0] is not set to max(y_score) + 1 and the extra threshold is missing. Consequently, the ROC curve does not start at the (0, 0) point.
Steps/Code to Reproduce
import numpy as np
from sklearn.metrics import roc_curve
actual = np.array([0., 0., 1., 1., 1.])
predicted = np.array([0.89, 0.61, 0.83, 0.62, 0.90])  # Change predicted[0] to 0.91 and you get the documented behaviour.
_, _, thresholds = roc_curve(actual, predicted, drop_intermediate=False)
thresholds
Expected Results
array([ 1.9 , 0.9 , 0.89, 0.83, 0.62, 0.61]) # Note the length of the array (6): max(y_score) + 1 comes first
Actual Results
array([ 0.9 , 0.89, 0.83, 0.62, 0.61]) # Note length of array
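To make the effect on the curve itself visible, here is a minimal NumPy-only sketch of a ROC computation. It is a simplified illustration, not scikit-learn's actual code: `roc_points` and its `prepend_origin` flag are hypothetical names, and it assumes both classes are present in `y_true`. Without the extra threshold, the first point for the data above is (0, 1/3); with it, the curve starts at (0, 0).

```python
import numpy as np


def roc_points(y_true, y_score, prepend_origin):
    """Simplified ROC computation (hypothetical helper, not sklearn's code)."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    # Sort samples by decreasing score.
    order = np.argsort(y_score, kind="mergesort")[::-1]
    y_true, y_score = y_true[order], y_score[order]
    # Keep the last index of each run of equal scores.
    distinct = np.where(np.diff(y_score))[0]
    idx = np.r_[distinct, y_true.size - 1]
    tps = np.cumsum(y_true)[idx]  # true positives at each threshold
    fps = 1 + idx - tps           # false positives at each threshold
    thresholds = y_score[idx]
    if prepend_origin:
        # Extra threshold at which no instance is predicted positive.
        tps = np.r_[0, tps]
        fps = np.r_[0, fps]
        thresholds = np.r_[thresholds[0] + 1, thresholds]
    # Assumes at least one negative and one positive sample.
    return fps / fps[-1], tps / tps[-1], thresholds


actual = np.array([0., 0., 1., 1., 1.])
predicted = np.array([0.89, 0.61, 0.83, 0.62, 0.90])

for flag in (False, True):
    fpr, tpr, thr = roc_points(actual, predicted, prepend_origin=flag)
    print(flag, (fpr[0], tpr[0]), len(thr))
```

With `prepend_origin=False` the result has 5 thresholds, matching the actual output above. With `prepend_origin=True` it has 6, and the first one is `max(y_score) + 1`, as the docstring describes.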
Versions
Linux-4.10.0-33-generic-x86_64-with-debian-stretch-sid
Python 3.6.1 |Anaconda custom (64-bit)| (default, May 11 2017, 13:09:58)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
NumPy 1.12.1
SciPy 0.19.0
Scikit-Learn 0.18.1