
What is the difference between cross_val_score with scoring='roc_auc' and roc_auc_score?

I am confused about the difference between the cross_val_score scoring metric 'roc_auc' and the roc_auc_score that I can just import and call directly.

The documentation (http://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter) indicates that specifying scoring='roc_auc' will use sklearn.metrics.roc_auc_score. However, when I use GridSearchCV or cross_val_score with scoring='roc_auc', I get very different numbers than when I call roc_auc_score directly.

Here is my code to help demonstrate what I see:

# score the model using cross_val_score

from sklearn.ensemble import RandomForestClassifier
from sklearn.cross_validation import cross_val_score  # sklearn.model_selection in newer versions

rf = RandomForestClassifier(n_estimators=150,
                            min_samples_leaf=4,
                            min_samples_split=3,
                            n_jobs=-1)

scores = cross_val_score(rf, X, y, cv=3, scoring='roc_auc')

print(scores)
# array([ 0.9649023 ,  0.96242235,  0.9503313 ])

# do a train_test_split, fit the model, and score with roc_auc_score

from sklearn.cross_validation import train_test_split  # sklearn.model_selection in newer versions
from sklearn.metrics import roc_auc_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
rf.fit(X_train, y_train)

print(roc_auc_score(y_test, rf.predict(X_test)))
# 0.84634039111363313  -- quite a bit different from the scores above!

I feel like I am missing something very simple here -- most likely a mistake in how I am implementing/interpreting one of the scoring metrics.

Can anyone shed any light on the reason for the discrepancy between the two scoring metrics?

asked Nov 11 '15 by MichaelHood


2 Answers

This is because you supplied the predicted class labels instead of probabilities to roc_auc_score. That function expects a score (such as the probability of the positive class), not the predicted label. Try this instead:

print(roc_auc_score(y_test, rf.predict_proba(X_test)[:, 1]))

It should give a result similar to the ones from cross_val_score above. Refer to this post for more info.
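To make the difference concrete, here is a minimal, self-contained sketch on a synthetic dataset (make_classification, an illustrative assumption rather than the asker's data); X_demo, y_demo, and rf_demo are placeholder names:

# Minimal sketch on synthetic data: AUC from hard labels vs. AUC from probabilities.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.cross_validation import train_test_split  # sklearn.model_selection in newer versions
from sklearn.metrics import roc_auc_score

X_demo, y_demo = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.33, random_state=0)

rf_demo = RandomForestClassifier(n_estimators=150, n_jobs=-1).fit(X_tr, y_tr)

# Hard 0/1 predictions collapse the ROC curve to a single operating point,
# so the resulting "AUC" is typically noticeably lower.
print(roc_auc_score(y_te, rf_demo.predict(X_te)))

# Probability of the positive class -- this is what scoring='roc_auc' evaluates.
print(roc_auc_score(y_te, rf_demo.predict_proba(X_te)[:, 1]))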

answered by George Liu


I just ran into a similar issue. The key takeaway was that cross_val_score builds its train-test splits with KFold (StratifiedKFold for classifiers) using default parameters, which means splitting the data into consecutive chunks rather than shuffling it. train_test_split, on the other hand, does a shuffled split.

The solution is to make the split strategy explicit and specify shuffling, like this:

from sklearn import cross_validation  # sklearn.model_selection in newer versions

shuffle = cross_validation.KFold(len(X), n_folds=3, shuffle=True)
scores = cross_val_score(rf, X, y, cv=shuffle, scoring='roc_auc')
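As a side note (this assumes a newer scikit-learn version and is not part of the original answer): the cross_validation module has since been replaced by model_selection, and KFold no longer takes the dataset length. A rough modern equivalent of the snippet above, reusing the same rf, X, and y, would be:

# Rough equivalent for scikit-learn >= 0.18, reusing rf, X, y from above.
from sklearn.model_selection import KFold, cross_val_score

shuffle = KFold(n_splits=3, shuffle=True, random_state=0)
scores = cross_val_score(rf, X, y, cv=shuffle, scoring='roc_auc')
print(scores)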
answered by Aniket Schneider