According to the definition of precision and recall (see http://scikit-learn.org/stable/auto_examples/model_selection/plot_precision_recall.html), when TP=0 and FP=0 the recall is zero while the precision is undefined (a 0/0 limit).
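For reference, these are the standard formulas behind that edge case (not sklearn-specific):

```
precision = TP / (TP + FP)   ->  0 / 0 (undefined) when TP = 0 and FP = 0
recall    = TP / (TP + FN)   ->  0                 when TP = 0
```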
In such cases the function `precision_recall_curve` returns a precision equal to 1, which is inconsistent with other functions within sklearn and could lead to misleading interpretations. The following code demonstrates the problem:
```python
import numpy as np
from sklearn.metrics import precision_recall_curve, precision_recall_fscore_support, \
    average_precision_score, auc
from matplotlib import pyplot as plt

# create some example labels
y = np.hstack([np.ones((100,)), np.zeros((900,))])

# create fake uniform probabilities of a dummy classifier
prob_random = np.ones_like(y) * .5

# Note that when recall is 0 (all samples assigned to class 0) the precision
# returned is 1. However, this case is mathematically undefined (a 0/0 limit)
# and it could make sense to set the returned precision to 0 rather than 1,
# as is done in other functions within sklearn.
precision, recall, _ = precision_recall_curve(y, prob_random)
print('precision at 0 recall = {}'.format(precision[1]))

# For example, this function returns precision 0 and recall 0
P, R, _, _ = precision_recall_fscore_support(y, np.zeros_like(y))
print('precision at 0 recall = {}'.format(P[1]))

# A problem with the first definition is that the AUC under the precision-recall
# curve of a random classifier appears artificially very good
plt.plot(recall, precision)
recall_def2 = [1, 0]
precision_def2 = [.1, 0]
plt.plot(recall_def2, precision_def2)
print('AUC current definition {}'.format(auc(recall, precision)))
print('AUC second definition {}'.format(auc(recall_def2, precision_def2)))
```
Output:

```
precision at 0 recall = 1.0
precision at 0 recall = 0.0
AUC current definition 0.55
AUC second definition 0.05
```
Also, this definition may create problems when using PR-AUC as a CV scorer, because classifiers with constant probabilities (such as the random one above) will get high scores.
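A minimal sketch of that scorer concern, assuming a synthetic dataset with uninformative features and `average_precision` as the alternative scorer (the dataset and estimator here are illustrative, not from the report above):

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score

# same class balance as the example above: 10% positives
y = np.hstack([np.ones((100,)), np.zeros((900,))])
X = np.zeros((1000, 1))  # uninformative features: the model stays a constant-probability classifier

# 'average_precision' scores the dummy near the positive class prior (~0.1),
# whereas trapezoidal AUC over the PR curve reports ~0.55 for the same model.
scores = cross_val_score(DummyClassifier(strategy='prior'), X, y,
                         scoring='average_precision', cv=5)
print(scores.mean())
```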
OK, I got doubly confused here. This is not AUC; AUC is the area under the ROC curve. You're talking about average precision, which is not computed using the `auc` function and should not be computed that way. Calling `average_precision_score` on your example yields .1.
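For completeness, a quick check of that last point (reusing `y` and `prob_random` from the snippet above):

```python
from sklearn.metrics import average_precision_score

# For a constant-score classifier, average precision equals the positive
# class prevalence (100 / 1000 = 0.1) rather than the optimistic 0.55.
print(average_precision_score(y, prob_random))  # 0.1
```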