
GridSearch for an estimator inside a OneVsRestClassifier

I want to run GridSearchCV on an SVC model that uses the one-vs-all strategy. For the latter part, I can just do this:

model_to_set = OneVsRestClassifier(SVC(kernel="poly")) 

My problem is with the parameters. Let's say I want to try the following values:

parameters = {"C":[1,2,4,8], "kernel":["poly","rbf"],"degree":[1,2,3,4]} 

In order to perform GridSearchCV, I should do something like:

cv_generator = StratifiedKFold(y, k=10)
model_tunning = GridSearchCV(model_to_set, param_grid=parameters, score_func=f1_score, n_jobs=1, cv=cv_generator)

However, when I execute it I get:

Traceback (most recent call last):
  File "/.../main.py", line 66, in <module>
    argclass_sys.set_model_parameters(model_name="SVC", verbose=3, file_path=PATH_ROOT_MODELS)
  File "/.../base.py", line 187, in set_model_parameters
    model_tunning.fit(self.feature_encoder.transform(self.train_feats), self.label_encoder.transform(self.train_labels))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 354, in fit
    return self._fit(X, y)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 392, in _fit
    for clf_params in grid for train, test in cv)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 473, in __call__
    self.dispatch(function, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 296, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 124, in __init__
    self.results = func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 85, in fit_grid_point
    clf.set_params(**clf_params)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/base.py", line 241, in set_params
    % (key, self.__class__.__name__))
ValueError: Invalid parameter kernel for estimator OneVsRestClassifier

Basically, since the SVC is inside a OneVsRestClassifier and that's the estimator I send to the GridSearchCV, the SVC's parameters can't be accessed.
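The failure described above can be reproduced without GridSearchCV at all, since the grid search only forwards each candidate to set_params. The following is a minimal sketch, assuming a reasonably recent scikit-learn install; the variable names error_message and top_level_names are illustrative and not part of the original code.

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

# "kernel" is a parameter of the inner SVC, not of the wrapper,
# so setting it directly on the wrapper is rejected.
try:
    model_to_set.set_params(kernel="rbf")
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Names the wrapper exposes at its own top level (illustrative check).
top_level_names = sorted(model_to_set.get_params(deep=False))
print(top_level_names)
```

The exact wording of the message may differ between scikit-learn versions, but the exception type is the same ValueError seen in the traceback.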

In order to accomplish what I want, I see two solutions:

  1. When creating the SVC, somehow tell it not to use the one-vs-one strategy but the one-vs-all.
  2. Somehow indicate to GridSearchCV that the parameters belong to the estimator inside the OneVsRestClassifier.

I have yet to find a way to do either of these. Do you know if there is a way to do one of them? Or could you suggest another way to get the same result?

Thanks!

asked Sep 28 '12 02:09 by feralvam



1 Answer

When you use nested estimators with grid search you can scope the parameters with __ as a separator. In this case the SVC model is stored as an attribute named estimator inside the OneVsRestClassifier model:

from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sklearn.grid_search import GridSearchCV
from sklearn.metrics import f1_score

iris = load_iris()

model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

parameters = {
    "estimator__C": [1,2,4,8],
    "estimator__kernel": ["poly","rbf"],
    "estimator__degree":[1, 2, 3, 4],
}

model_tunning = GridSearchCV(model_to_set, param_grid=parameters,
                             score_func=f1_score)

model_tunning.fit(iris.data, iris.target)

print model_tunning.best_score_
print model_tunning.best_params_

That yields:

0.973290762737
{'estimator__kernel': 'poly', 'estimator__C': 1, 'estimator__degree': 2}
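The snippet in this answer targets the scikit-learn and Python 2 versions of its time: the sklearn.grid_search module and the score_func argument have since been removed. A rough equivalent for current releases might look like the sketch below. It rests on several assumptions that are not in the original: GridSearchCV and StratifiedKFold are imported from sklearn.model_selection, scoring="f1_macro" stands in for score_func=f1_score (the macro averaging is a guess at the intent), and the 5-fold shuffled split with a fixed seed is illustrative only. The resulting score is not expected to match the number shown above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

iris = load_iris()

model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

# The wrapped SVC lives in the "estimator" attribute of the wrapper,
# so each of its parameters is addressed with the "estimator__" prefix.
parameters = {
    "estimator__C": [1, 2, 4, 8],
    "estimator__kernel": ["poly", "rbf"],
    "estimator__degree": [1, 2, 3, 4],
}

# Sanity check: every grid key should be a name the wrapper reports.
valid_names = model_to_set.get_params(deep=True)
assert all(name in valid_names for name in parameters)

# Assumed cross-validation setup; the split settings are illustrative.
cv_generator = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

model_tuning = GridSearchCV(
    model_to_set,
    param_grid=parameters,
    scoring="f1_macro",  # assumption: macro-averaged F1 for the 3 classes
    cv=cv_generator,
    n_jobs=1,
)
model_tuning.fit(iris.data, iris.target)

print(model_tuning.best_score_)
print(model_tuning.best_params_)
```

The get_params(deep=True) check is a cheap way to catch a misspelled or unscoped key before the search starts, instead of hitting the ValueError on the first grid point.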
answered Sep 19 '22 21:09 by ogrisel