Why is the default value for max_features in RandomForestClassifier different than the one in RandomForestRegressor?

Question

In RandomForestClassifier, the default value of max_features is sqrt(n_features), whereas in RandomForestRegressor it is n_features. Is there a specific reason for that?

Gilles Louppe · Accepted Answer

This is a heuristic based on empirical results. On average, max_features=sqrt(n_features) appears to be the better default for classification, and max_features=n_features the better default for regression.

This heuristic stems from this paper: http://orbi.ulg.ac.be/bitstream/2268/9357/1/geurts-mlj-advance.pdf

In any case, the defaults are only a starting point; it is always a good idea to cross-validate this parameter on your own data.
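As a rough illustration of cross-validating this parameter, here is a minimal sketch using GridSearchCV. The synthetic dataset, the candidate values, and the number of trees are assumptions chosen for the example, not recommendations; None is used to mean "all features".

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data, used only for illustration (assumed, not from the question).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# Candidates: sqrt(n_features), log2(n_features), half the features,
# and all features (None).
candidates = ["sqrt", "log2", 0.5, None]

search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid={"max_features": candidates},
    cv=5,
)
search.fit(X, y)

# Best setting and its mean cross-validated accuracy on this data.
print(search.best_params_, search.best_score_)
```

The same pattern should work for RandomForestRegressor; in that case GridSearchCV scores with R^2 by default instead of accuracy.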
