The documentation page for sklearn's random forest says:
The only supported criterion is “mse” for the mean squared error.
My data is messy and has outliers and I feel that MAE or some robust penalty function would perform much better.
Is there a way to fit a random forest regressor for another metric, for example iteratively? Is there another open-source Python alternative? Or is my assumption that I need a different metric wrong in itself? Sklearn is very well developed in other areas, so it seems strange to me that only MSE is supported for such an important approach as random forest.
You can use GridSearchCV or RandomizedSearchCV to optimize for another criterion in a cross-validation loop. Each forest will still be fit by minimizing MSE, but the CV loop finds, among the chosen parameter settings, the forest that optimizes the criterion you're actually interested in. (And it optimizes for the CV score, not the training-set score.)
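As a rough sketch of what that could look like: the synthetic data, the injected outliers, and the parameter grid below are all made up for illustration, so substitute your own. The forest keeps its default split criterion, and only the `scoring` argument tells the search to rank candidates by mean absolute error.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the real data (an assumption for this sketch).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
y[:5] += 500.0  # inject a few artificial outliers into the target

# Hypothetical grid; which hyperparameters and values to try is up to you.
param_grid = {
    "max_depth": [3, None],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(
    RandomForestRegressor(n_estimators=30, random_state=0),
    param_grid,
    # Candidates are ranked by cross-validated MAE. scikit-learn maximizes
    # scores, so the error is exposed with its sign flipped.
    scoring="neg_mean_absolute_error",
    cv=3,
)
search.fit(X, y)

best_params = search.best_params_
best_cv_mae = -search.best_score_  # flip the sign back to a plain MAE
print(best_params, best_cv_mae)
```

After fitting, `search.best_estimator_` is the forest refit on all the data with the winning settings. RandomizedSearchCV takes the same `scoring` argument if the grid gets too large to search exhaustively.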