
Optimizing RandomForestRegressor for other metrics

The documentation page for sklearn's random forest says:

The only supported criterion is “mse” for the mean squared error.

My data is messy and has outliers and I feel that MAE or some robust penalty function would perform much better.

Is there a way to fit a random forest regressor for another metric, for example iteratively? Is there another open-source Python alternative? Or is my assumption that I need a different metric wrong in itself? Sklearn is very well developed in other areas, so it seems strange to me that only MSE is supported for such an important approach as random forests.

asked May 14 '26 02:05 by trainset



1 Answer

You can use GridSearchCV or RandomizedSearchCV to optimize for another criterion in a cross-validation loop. The forests themselves will still optimize for MSE, but the CV loop finds the forest, among the chosen parameter settings, that optimizes the criterion you're actually interested in. (It also optimizes for CV score, not training-set score.)
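
As a rough illustration of that idea, here is a minimal sketch that selects forest hyperparameters by cross-validated MAE. The synthetic data, the injected outliers, the parameter grid, and the variable names are all assumptions made up for this example; only `GridSearchCV`, `RandomForestRegressor`, and the built-in `"neg_mean_absolute_error"` scorer come from scikit-learn itself.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Hypothetical data: a synthetic regression problem with a few large outliers.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
rng = np.random.RandomState(0)
outlier_idx = rng.choice(len(y), size=10, replace=False)
y[outlier_idx] += rng.normal(0, 500, size=10)

# Illustrative grid; real grids depend on the problem.
param_grid = {
    "max_depth": [3, None],
    "min_samples_leaf": [1, 5],
}

# Each forest is still grown with its default split criterion;
# only the model selection step is scored by MAE.
search = GridSearchCV(
    RandomForestRegressor(n_estimators=30, random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",  # scikit-learn scorers follow "higher is better"
    cv=3,
)
search.fit(X, y)

best_params = search.best_params_
best_cv_mae = -search.best_score_  # flip the sign back to a plain MAE
print(best_params, best_cv_mae)
```

Note that this only chooses among the candidate settings by MAE; it does not change how the individual trees pick their splits. `RandomizedSearchCV` accepts the same `scoring` argument if the grid is too large to search exhaustively.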

answered May 15 '26 16:05 by Fred Foo
