Change default RandomForestClassifier's "score" function when fitting the model?

I perform the fitting operation using RandomForestClassifier from sklearn:

clf.fit(X_train,y_train,sample_weight=weight)

I don't know how to change the evaluation metric, which I assume is simply accuracy here.

I'm asking because I've seen that the XGBoost package lets you specify this metric explicitly. Example:

clf.fit(X_train, y_train, eval_metric="auc", eval_set=[(X_eval, y_eval)])

So, my question is: can I do the same with RandomForestClassifier from sklearn? I need to base my performance evaluation on the AUC metric.

Guiem Bosch asked Mar 12 '16 00:03


2 Answers

Well, what I'm doing so far is wrapping the classifier in a GridSearchCV, where I can specify the scoring method.

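So: GS = GridSearchCV(forest_clf, parameters, scoring='roc_auc', verbose=10), with GridSearchCV imported from sklearn.model_selection (sklearn.grid_search in scikit-learn releases older than 0.18), works for me.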

But I'm open to suggestions if this can be done from the classifier itself, or to theoretical explanations if this is not a correct approach.
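For reference, here is a minimal, self-contained sketch of the GridSearchCV approach described above. The synthetic data from make_classification, the parameter grid, and cv=3 are assumptions made only so the example runs; they are not taken from the original problem.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary data standing in for X_train, y_train (assumption).
X_train, y_train = make_classification(
    n_samples=300, n_features=10, random_state=0
)

forest_clf = RandomForestClassifier(random_state=0)

# Hypothetical grid; pick values that suit your own data.
parameters = {"n_estimators": [20, 50], "max_depth": [3, None]}

# scoring="roc_auc" makes the cross-validation rank candidates by AUC
# instead of the classifier's default accuracy score.
GS = GridSearchCV(forest_clf, parameters, scoring="roc_auc", cv=3)
GS.fit(X_train, y_train)

best_auc = GS.best_score_      # mean cross-validated AUC of the best candidate
best_params = GS.best_params_  # parameter combination that achieved it
print(best_params, best_auc)
```

Note that the scoring argument only affects how candidates are compared during the search; each forest is still grown with its usual split criterion.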

Guiem Bosch answered Nov 06 '22 16:11


I don't think you can change the metric used by the score method of RandomForestClassifier.

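But this code should give you the AUC (for a binary problem, pass the probability of the positive class, which is column 1 of predict_proba):

from sklearn.metrics import roc_auc_score
roc_auc_score(y_eval, clf.predict_proba(X_eval)[:, 1])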
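As a fuller sketch of the same idea, the example below fits a forest, shows that its score method reports plain accuracy, and computes the AUC separately on a held-out set. The synthetic data and the train/eval split are assumptions for illustration, and the problem is assumed to be binary, so the positive-class probability is column 1 of predict_proba.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary data standing in for the real dataset (assumption).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_eval, y_train, y_eval = train_test_split(
    X, y, test_size=0.25, random_state=0
)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

# clf.score returns mean accuracy on the data it is given.
accuracy = clf.score(X_eval, y_eval)
reference_accuracy = accuracy_score(y_eval, clf.predict(X_eval))

# roc_auc_score needs a continuous score for the positive class,
# so take column 1 of predict_proba rather than the full 2-column array.
proba_pos = clf.predict_proba(X_eval)[:, 1]
auc = roc_auc_score(y_eval, proba_pos)
print(accuracy, auc)
```

Computing the metric after fitting like this evaluates the model by AUC without changing how the forest itself is trained.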
Frank answered Nov 06 '22 17:11