What is the difference between cross_val_score with scoring='roc_auc' and roc_auc_score?

Question

I am confused about the difference between the cross_val_score scoring metric 'roc_auc' and the roc_auc_score that I can just import and call directly.

The documentation (http://scikit-learn.org/stable/modules/model_evaluation.html#scoring-parameter) indicates that specifying scoring='roc_auc' will use sklearn.metrics.roc_auc_score. However, when I use GridSearchCV or cross_val_score with scoring='roc_auc', I get very different numbers than when I call roc_auc_score directly.

Here is my code to help demonstrate what I see:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# X, y: feature matrix and binary labels (not shown here)

# score the model using cross_val_score

rf = RandomForestClassifier(n_estimators=150,
                            min_samples_leaf=4,
                            min_samples_split=3,
                            n_jobs=-1)

scores = cross_val_score(rf, X, y, cv=3, scoring='roc_auc')

print(scores)
# array([ 0.9649023 ,  0.96242235,  0.9503313 ])

# do a train_test_split, fit the model, and score with roc_auc_score

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
rf.fit(X_train, y_train)

print(roc_auc_score(y_test, rf.predict(X_test)))
# 0.84634039111363313, quite a bit different than the scores above!

I feel like I am missing something very simple here -- most likely a mistake in how I am implementing/interpreting one of the scoring metrics.

Can anyone shed any light on the reason for the discrepancy between the two scoring metrics?

George Liu · Accepted Answer

This is because you passed the predicted labels to roc_auc_score instead of the predicted probabilities. The function expects a continuous score for each sample, not the classified label. With hard 0/1 labels the ROC curve collapses to a single threshold, which usually lowers the AUC. Try this instead:

print(roc_auc_score(y_test, rf.predict_proba(X_test)[:, 1]))

It should give a result similar to the earlier ones from cross_val_score. Refer to this post for more info.
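To see why labels and probabilities give different numbers, here is a minimal sketch in plain Python. It assumes the usual reading of AUC for a binary problem: the fraction of (positive, negative) pairs in which the positive sample gets the higher score, with ties counted as half. The helper pairwise_auc and the toy data are made up for illustration and are not part of scikit-learn; roc_auc_score should return the same values for these inputs.

```python
def pairwise_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy data: true labels and predicted P(class 1).
y_true = [0, 0, 0, 1, 1, 1]
proba = [0.1, 0.2, 0.6, 0.4, 0.7, 0.9]

# Roughly what predict() does for a binary classifier:
# threshold the probability at 0.5 and return a hard label.
labels = [1 if p > 0.5 else 0 for p in proba]

auc_from_proba = pairwise_auc(y_true, proba)
auc_from_labels = pairwise_auc(y_true, labels)

print(auc_from_proba)   # 8 of 9 pairs ranked correctly
print(auc_from_labels)  # many pairs are now ties, so the score drops to 6 of 9
```

Thresholding throws away the ordering among samples on the same side of 0.5, so every such pair becomes a tie. That is the information scoring='roc_auc' keeps and rf.predict discards.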

Aniket Schneider · Answer

I just ran into a similar issue here. The key takeaway there was that cross_val_score, when given an integer cv, makes its splits with a K-fold strategy using default parameters (StratifiedKFold for classifiers, KFold otherwise). Neither shuffles by default, so the folds are taken in the order the data is stored. train_test_split, on the other hand, shuffles before splitting.

The solution is to make the split strategy explicit and specify shuffling, like this:

from sklearn.model_selection import KFold  # sklearn.cross_validation was removed in 0.20

shuffle = KFold(n_splits=3, shuffle=True)
scores = cross_val_score(rf, X, y, cv=shuffle, scoring='roc_auc')
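A small sketch of what the shuffle flag changes, assuming a scikit-learn version that has sklearn.model_selection (0.18 or later). The six-row array is hypothetical data and random_state=0 is fixed only to make the run repeatable.

```python
import numpy as np
from sklearn.model_selection import KFold

# Six hypothetical samples; only their row indices matter here.
X = np.arange(12).reshape(6, 2)

# Default KFold: test folds are consecutive chunks of the data.
ordered = [test.tolist() for _, test in KFold(n_splits=3).split(X)]

# shuffle=True: row indices are permuted before being cut into folds.
shuffled = [
    test.tolist()
    for _, test in KFold(n_splits=3, shuffle=True, random_state=0).split(X)
]

print(ordered)   # [[0, 1], [2, 3], [4, 5]]
print(shuffled)  # same indices, distributed across folds in shuffled order
```

If the rows are sorted in some meaningful way (by class, by time, by source), the unshuffled folds can each see a different slice of the data, which is one way cross-validation scores end up differing from a shuffled train_test_split.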

Tags: python, machine-learning, scikit-learn, random-forest, cross-validation

MichaelHood
