
How to define specificity as a callable scorer for model evaluation

I am using this code to compare the performance of several models:

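from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# X = input data
# Y = binary labels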

models = []
models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))

results = []
names = []
scoring = 'accuracy'

for name, model in models:
    # Note: recent scikit-learn versions require shuffle=True when random_state is set.
    kfold = model_selection.KFold(n_splits=10, random_state=7)
    cv_results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %.2f (%.2f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)

I can use 'accuracy' and 'recall' as scoring, and these give accuracy and sensitivity. How can I create a scorer that gives me specificity?

Specificity = TN / (TN + FP)

where TN and FP are the true negative and false positive counts in the confusion matrix.
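
As a quick sanity check of this formula, here is a minimal sketch on made-up labels (the two arrays are illustrative only and are not taken from the data above). It assumes the classes are coded as 0 (negative) and 1 (positive); passing labels=[0, 1] fixes the matrix layout so that row 0 always holds the actual negatives.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration: 0 = negative class, 1 = positive class.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]

# labels=[0, 1] forces a 2x2 matrix laid out as [[TN, FP], [FN, TP]].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

# Specificity = TN / (TN + FP): the share of actual negatives predicted as negative.
specificity = tn / (tn + fp)
print(specificity)
```

Here three of the four actual negatives are predicted as negative, so the ratio is 3/4.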

I have tried this:

from sklearn.metrics import confusion_matrix, make_scorer

def tp(y_true, y_pred):
    error = confusion_matrix(y_true, y_pred)[0, 0] / (confusion_matrix(y_true, y_pred)[0, 0] + confusion_matrix(y_true, y_pred)[0, 1])
    return error

my_scorer = make_scorer(tp, greater_is_better=True)

and then

cv_results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=my_scorer)

but it does not work for n_splits >= 10. I get this error when my_scorer is calculated:

IndexError: index 1 is out of bounds for axis 1 with size 1

J Joe asked Dec 07 '17 21:12


1 Answer

For a binary classifier, calling recall_score with pos_label=0 gives specificity. The default, pos_label=1, gives sensitivity.

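from sklearn.metrics import accuracy_score, make_scorer, recall_score

scoring = {
    'accuracy': make_scorer(accuracy_score),
    'sensitivity': make_scorer(recall_score),                 # recall of class 1
    'specificity': make_scorer(recall_score, pos_label=0)     # recall of class 0
}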
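
A minimal sketch of how this dictionary might be used, assuming the labels are coded as 0/1. cross_val_score takes a single scorer, so the sketch uses cross_validate, which accepts a dict of scorers. The synthetic data from make_classification and the switch to StratifiedKFold are assumptions made for illustration; neither comes from the question.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, make_scorer, recall_score
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic binary data (assumption: stands in for the real X and Y).
X, Y = make_classification(n_samples=200, n_features=5, random_state=7)

scoring = {
    'accuracy': make_scorer(accuracy_score),
    'sensitivity': make_scorer(recall_score),
    'specificity': make_scorer(recall_score, pos_label=0),
}

# Stratified folds keep both classes present in every test fold.
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)
cv_results = cross_validate(LogisticRegression(), X, Y, cv=kfold, scoring=scoring)

# cross_validate stores each metric's per-fold scores under 'test_<name>'.
for metric in scoring:
    scores = cv_results['test_' + metric]
    print("%s: %.2f (%.2f)" % (metric, scores.mean(), scores.std()))
```

The IndexError in the question is consistent with a test fold in which only one class appears in both y_true and y_pred: without a labels argument, confusion_matrix then returns a 1x1 array, so indexing [0, 1] fails. Stratified folds make that situation much less likely.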
Benjamin Tehan answered Sep 28 '22 10:09