
thresholds in roc_curve in scikit learn

I am referring to the link and sample below, and I have posted the plot from that page, which is where I am confused. There are only 4 thresholds, but the ROC curve seems to have many more data points (> 4). How does roc_curve work under the hood to find more data points?

http://scikit-learn.org/stable/modules/model_evaluation.html#roc-metrics

>>> import numpy as np
>>> from sklearn.metrics import roc_curve
>>> y = np.array([1, 1, 2, 2])
>>> scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> fpr, tpr, thresholds = roc_curve(y, scores, pos_label=2)
>>> fpr
array([ 0. ,  0.5,  0.5,  1. ])
>>> tpr
array([ 0.5,  0.5,  1. ,  1. ])
>>> thresholds
array([ 0.8 ,  0.4 ,  0.35,  0.1 ])

[image: ROC curve plot from the linked page]

Lin Ma asked Dec 18 '25 22:12



1 Answer

As HaohanWang mentioned, the drop_intermediate parameter of the roc_curve function drops some suboptimal thresholds in order to create lighter ROC curves (see roc_curve).

If you set the parameter to False, all thresholds are returned, for example: [image]

All thresholds and their corresponding TPRs and FPRs are calculated, but some of them are not needed for plotting the ROC curve.
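To see the effect directly, here is a minimal sketch that calls roc_curve twice on the same data, once with the default drop_intermediate=True and once with drop_intermediate=False. The toy labels and scores are made up for illustration and are not the ones from the question. The exact number of returned thresholds depends on the scikit-learn version (newer releases prepend an extra starting point), so the sketch only compares the two results with each other.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Hypothetical toy data: three negatives and three positives,
# perfectly separated by score.
y_true = np.array([0, 0, 0, 1, 1, 1])
scores = np.array([0.1, 0.2, 0.3, 0.7, 0.8, 0.9])

# Default (drop_intermediate=True): points lying on a straight segment
# between two other ROC points are dropped.
fpr_kept, tpr_kept, thr_kept = roc_curve(y_true, scores)

# Keep one ROC point per distinct score value.
fpr_all, tpr_all, thr_all = roc_curve(y_true, scores, drop_intermediate=False)

print("thresholds kept:", len(thr_kept))
print("thresholds total:", len(thr_all))

# The dropped points are collinear with their neighbours, so the
# area under the curve is the same either way.
print("AUC (kept):", auc(fpr_kept, tpr_kept))
print("AUC (all): ", auc(fpr_all, tpr_all))
```

The shorter result traces the same curve: the dropped points only subdivide straight horizontal or vertical segments, which is why they are described as suboptimal for plotting.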

user11376501 answered Dec 21 '25 13:12
