Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Which accuracy score to use for the Mean Decrease Accuracy with the scikit RandomForestClassifier

I've been running the implementation the 'Mean Decrease Accuracy' measure that is shown on this website:

In the example the author is using the random forest regressor RandomForestRegressor, but I am using the random forest classifier RandomForestClassifier. Thus, my question is, if I should also use the r2_score for measuring accuracy or if I should switch to classic accuracy accuracy_score or matthews correlation coefficient matthews_corrcoef?.

Does anybody here if I should switch or not. And why?

Thanks for any help!


Here is the code from the website in case you are too lazy to click :)

from sklearn.cross_validation import ShuffleSplit
from sklearn.metrics import r2_score
from collections import defaultdict

X = boston["data"]
Y = boston["target"]

rf = RandomForestRegressor()
scores = defaultdict(list)

#crossvalidate the scores on a number of different random splits of the data
for train_idx, test_idx in ShuffleSplit(len(X), 100, .3):
    X_train, X_test = X[train_idx], X[test_idx]
    Y_train, Y_test = Y[train_idx], Y[test_idx]
    r = rf.fit(X_train, Y_train)
    acc = r2_score(Y_test, rf.predict(X_test))
    for i in range(X.shape[1]):
        X_t = X_test.copy()
        np.random.shuffle(X_t[:, i])
        shuff_acc = r2_score(Y_test, rf.predict(X_t))
        scores[names[i]].append((acc-shuff_acc)/acc)
print "Features sorted by their score:"
print sorted([(round(np.mean(score), 4), feat) for
              feat, score in scores.items()], reverse=True)
like image 909
dmeu Avatar asked Feb 14 '26 15:02

dmeu


1 Answers

r2_score is for regression (continuous response variable), whereas classic classification (discrete categorical variable) metrics such like accuracy_score and f1_score roc_auc (the last two are most appropriate if you have unbalanced y-labels) are right choices for your task.

Random shuffling each features in the input data matrix and measuring the decline in these classification metrics sounds like a valid approach to rank feature importances.

like image 181
Jianxun Li Avatar answered Feb 16 '26 06:02

Jianxun Li



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!