Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is training a random forest regressor with MAE criterion so slow compared to MSE?

When training on even small applications (<50K rows <50 columns) using the mean absolute error criterion for sklearn's RandomForestRegress is nearly 10x slower than using mean squared error. To illustrate even on a small data set:

import time
from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import load_boston

X, y = load_boston(return_X_y=True)

def fit_rf_criteria(criterion, X=X, y=y):
    reg = RandomForestRegressor(n_estimators=100,
                                criterion=criterion,
                                n_jobs=-1,
                                random_state=1)
    start = time.time()
    reg.fit(X, y)
    end = time.time()
    print(end - start)

fit_rf_criteria('mse')  # 0.13266682624816895
fit_rf_criteria('mae')  # 1.26043701171875

Why does using the 'mae' criterion take so long for training a RandomForestRegressor? I want to optimize MAE for larger applications, but find the speed of the RandomForestRegressor tuned to this criterion prohibitively slow.

like image 714
kevins_1 Avatar asked Jul 28 '19 17:07

kevins_1


People also ask

Is Random Forest slow?

The main limitation of random forest is that a large number of trees can make the algorithm too slow and ineffective for real-time predictions. In general, these algorithms are fast to train, but quite slow to create predictions once they are trained.

What is MSE in random forest?

The two main parameters are mtry and ntree, the number of trees in the forest. We used the mean squared error (abbreviated MSE) as a measure of the prediction accuracy of the RF model. Two MSE error estimates are used in the validation procedure: the OOB error and the cross-validation error.


1 Answers

Thank you @hellpanderr for sharing a reference to the project issue. To summarize – when the random forest regressor optimizes for MSE it optimizes for the L2-norm and a mean-based impurity metric. But when the regressor uses the MAE criterion it optimizes for the L1-norm which amounts to calculating the median. Unfortunately, sklearn's the regressor's implementation for MAE appears to take O(N^2) currently.

like image 56
kevins_1 Avatar answered Oct 02 '22 00:10

kevins_1