I'm trying to fit some models in scikit-learn using GridSearchCV, and I would like to use the "one standard error" rule to select the best model, i.e. selecting the most parsimonious model from the subset of models whose score is within one standard error of the best score. Is there a way to do this?
You can compute the standard error of the mean of the validation scores using:
from scipy.stats import sem
Then access the grid_scores_ attribute of the fitted GridSearchCV object. This attribute has changed in the master branch of scikit-learn (releases from 0.18 onward expose cv_results_ instead), so please use an interactive shell to introspect its structure.
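As a minimal sketch of that step, assuming a scikit-learn version that exposes cv_results_ with per-fold keys named split0_test_score, split1_test_score, and so on: the helper fold_standard_errors and the fake_results dictionary below are hypothetical, made up only to illustrate the shape of the data. With a real search you would pass the fitted object's cv_results_ instead.

```python
import numpy as np
from scipy.stats import sem


def fold_standard_errors(cv_results, n_splits):
    """Return (mean score, standard error) per candidate from per-fold scores."""
    # Stack the per-fold columns into an array of shape (n_candidates, n_splits).
    fold_scores = np.column_stack(
        [cv_results["split%d_test_score" % i] for i in range(n_splits)]
    )
    # sem divides the sample standard deviation (ddof=1) by sqrt(n_splits).
    return fold_scores.mean(axis=1), sem(fold_scores, axis=1)


# Hypothetical stand-in for cv_results_: two candidates, three folds.
fake_results = {
    "split0_test_score": np.array([0.80, 0.85]),
    "split1_test_score": np.array([0.82, 0.83]),
    "split2_test_score": np.array([0.78, 0.87]),
}

means, errors = fold_standard_errors(fake_results, n_splits=3)
print(means, errors)
```

Note that the std_test_score entry of cv_results_ is a standard deviation across folds, not a standard error, which is why the sketch goes back to the per-fold scores.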
As for selecting the most parsimonious model, model parameters do not always have a degrees-of-freedom interpretation. Their meaning is often model specific, and there is no high-level metadata for interpreting their "parsimony". You will have to encode your interpretation on a case-by-case basis for each model class.
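One way to encode such an interpretation is to supply your own complexity number per candidate and apply the rule by hand. The sketch below is not a scikit-learn API: the function one_standard_error_choice, the example scores, and the choice to treat a larger C as "more complex" are all assumptions made for illustration. It also assumes that a higher score is better.

```python
import numpy as np


def one_standard_error_choice(mean_scores, std_errors, complexity):
    """Index of the least complex candidate within one SE of the best score.

    complexity is a user-supplied number per candidate; lower means
    more parsimonious under whatever interpretation you have chosen.
    """
    mean_scores = np.asarray(mean_scores, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    complexity = np.asarray(complexity, dtype=float)

    best = np.argmax(mean_scores)  # assumes higher score is better
    threshold = mean_scores[best] - std_errors[best]
    eligible = np.flatnonzero(mean_scores >= threshold)

    # Sort eligible candidates by complexity, breaking ties by higher mean score.
    order = np.lexsort((-mean_scores[eligible], complexity[eligible]))
    return int(eligible[order[0]])


# Hypothetical grid over a regularization parameter C, where a larger C
# (weaker regularization) is treated as the more complex model.
C_values = [0.01, 0.1, 1.0, 10.0]
mean_scores = [0.70, 0.84, 0.86, 0.855]
std_errors = [0.02, 0.02, 0.025, 0.02]

chosen = one_standard_error_choice(mean_scores, std_errors, complexity=C_values)
print("chosen C:", C_values[chosen])
```

Here the best mean score belongs to C = 1.0, but C = 0.1 scores within one standard error of it and is the simplest eligible candidate under the assumed ordering, so it is the one selected.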