One standard error rule for cross-validation in scikit-learn

Tags:

scikit-learn

I'm trying to fit some models in scikit-learn using GridSearchCV, and I would like to use the "one standard error" rule to select the best model, i.e. to select the most parsimonious model from the subset of models whose score is within one standard error of the best score. Is there a way to do this?

user2212589 Avatar asked Oct 22 '22 13:10

1 Answer

You can compute the standard error of the mean of the validation scores using:

from scipy.stats import sem

Then access the grid_scores_ attribute of the fitted GridSearchCV object. This attribute has changed in the master branch of scikit-learn so please use an interactive shell to introspect its structure.

As for selecting the most parsimonious model, model parameters do not always have a degrees-of-freedom interpretation. The meaning of the parameters is often model-specific, and there is no high-level metadata to interpret their "parsimony". You will have to encode your interpretation on a case-by-case basis for each model class.
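To make the rule concrete, here is a minimal sketch using only the standard library. It assumes you have already pulled the per-fold validation scores out of the fitted search object (in recent scikit-learn releases these should be available in cv_results_ under keys such as split0_test_score, but check your version interactively) and that you supply your own complexity value for each candidate. The names one_se_select, fold_scores and complexity are hypothetical, not part of scikit-learn, and the standard error here uses the sample standard deviation, which matches the default of scipy.stats.sem.

```python
import math
import statistics


def standard_error(scores):
    """Standard error of the mean, using the sample (ddof=1) standard deviation."""
    return statistics.stdev(scores) / math.sqrt(len(scores))


def one_se_select(fold_scores, complexity):
    """Return the index of the candidate chosen by the one-standard-error rule.

    fold_scores: one sequence of per-fold scores per candidate (higher is better).
    complexity:  one user-defined number per candidate (lower = more parsimonious).
                 This ordering is an assumption you must encode per model class.
    """
    means = [statistics.mean(s) for s in fold_scores]
    # Candidate with the best mean validation score.
    best = max(range(len(means)), key=means.__getitem__)
    # Anything within one standard error of the best score is eligible.
    threshold = means[best] - standard_error(fold_scores[best])
    eligible = [i for i, m in enumerate(means) if m >= threshold]
    # Among eligible candidates, take the simplest; break ties by higher mean.
    return min(eligible, key=lambda i: (complexity[i], -means[i]))


# Illustrative, made-up scores for three candidates over three folds.
fold_scores = [
    [0.80, 0.82, 0.78],
    [0.90, 0.88, 0.92],
    [0.91, 0.89, 0.93],
]
complexity = [1, 2, 3]  # e.g. a tree depth, or the inverse of a regularization strength
print(one_se_select(fold_scores, complexity))  # prints 1
```

In the example, the third candidate has the best mean score, but the second is within one standard error of it and is simpler, so it is the one selected. The first candidate is simpler still but falls below the threshold.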

ogrisel Avatar answered Jan 02 '23 20:01
