I want to run GridSearchCV on an SVC model that uses the one-vs-all strategy. For the latter part, I can just do this:
model_to_set = OneVsRestClassifier(SVC(kernel="poly"))
My problem is with the parameters. Let's say I want to try the following values:
parameters = {"C":[1,2,4,8], "kernel":["poly","rbf"],"degree":[1,2,3,4]}
In order to perform GridSearchCV, I should do something like:
cv_generator = StratifiedKFold(y, k=10)
model_tunning = GridSearchCV(model_to_set, param_grid=parameters, score_func=f1_score, n_jobs=1, cv=cv_generator)
However, when I execute it I get:
Traceback (most recent call last):
  File "/.../main.py", line 66, in <module>
    argclass_sys.set_model_parameters(model_name="SVC", verbose=3, file_path=PATH_ROOT_MODELS)
  File "/.../base.py", line 187, in set_model_parameters
    model_tunning.fit(self.feature_encoder.transform(self.train_feats), self.label_encoder.transform(self.train_labels))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 354, in fit
    return self._fit(X, y)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 392, in _fit
    for clf_params in grid for train, test in cv)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 473, in __call__
    self.dispatch(function, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 296, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 124, in __init__
    self.results = func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 85, in fit_grid_point
    clf.set_params(**clf_params)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/base.py", line 241, in set_params
    % (key, self.__class__.__name__))
ValueError: Invalid parameter kernel for estimator OneVsRestClassifier
Basically, since the SVC is inside a OneVsRestClassifier and that's the estimator I send to the GridSearchCV, the SVC's parameters can't be accessed.
In order to accomplish what I want, I see two solutions:
I have yet to find a way to do either of the mentioned alternatives. Do you know if there's a way to do either of them? Or maybe you could suggest another way to get the same result?
Thanks!
What is GridSearchCV? GridSearchCV is a class in scikit-learn's model_selection module. It loops over a predefined grid of hyperparameters and fits your estimator (model) on your training set with cross-validation for each combination, so that in the end you can select the best parameters from the listed values.
For a parameter grid with 3125 combinations, Grid Search CV took 10856 seconds (~3 hrs) whereas Halving Grid Search CV took 465 seconds (~8 mins), which is approximately 23 times faster.
param_grid – A dictionary with parameter names as keys and lists of parameter values.
scoring – The performance measure. For example, 'r2' for regression models, 'precision' for classification models.
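The parameter names accepted as param_grid keys are the ones the estimator itself reports through get_params(). The following is a minimal sketch, assuming a reasonably recent scikit-learn; the variable names are illustrative only. It lists those names for an SVC wrapped in OneVsRestClassifier.

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Wrap an SVC in a one-vs-rest meta-estimator, as in the question.
model = OneVsRestClassifier(SVC(kernel="poly"))

# get_params() (deep=True by default) returns the wrapper's own parameters
# plus the wrapped estimator's parameters under "<attribute>__<parameter>".
param_names = sorted(model.get_params().keys())
print(param_names)

# A bare "kernel" is not a parameter of the wrapper; the nested name is.
has_top_level_kernel = "kernel" in param_names
has_nested_kernel = "estimator__kernel" in param_names
print(has_top_level_kernel, has_nested_kernel)
```

Any name printed here can be used as a param_grid key; a name that is missing from the list produces the "Invalid parameter" error.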
When you use nested estimators with grid search you can scope the parameters with __ as a separator. In this case the SVC model is stored as an attribute named estimator inside the OneVsRestClassifier model:
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC
from sklearn.grid_search import GridSearchCV
from sklearn.metrics import f1_score

iris = load_iris()

model_to_set = OneVsRestClassifier(SVC(kernel="poly"))

parameters = {
    "estimator__C": [1,2,4,8],
    "estimator__kernel": ["poly","rbf"],
    "estimator__degree":[1, 2, 3, 4],
}

model_tunning = GridSearchCV(model_to_set, param_grid=parameters, score_func=f1_score)

model_tunning.fit(iris.data, iris.target)

print model_tunning.best_score_
print model_tunning.best_params_
That yields:
0.973290762737
{'estimator__kernel': 'poly', 'estimator__C': 1, 'estimator__degree': 2}
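The snippet above targets the old Python 2 / sklearn.grid_search API. On current scikit-learn releases the same idea could look roughly like the sketch below. It rests on a few assumptions that are not in the original: GridSearchCV is imported from sklearn.model_selection, the removed score_func argument is replaced by scoring="f1_macro" (one possible multiclass F1 averaging), and StratifiedKFold is built with n_splits, shuffle and random_state. The best score it reports may differ from the number shown above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

iris = load_iris()

model_to_set = OneVsRestClassifier(SVC())

# Keys are scoped with "estimator__" because the SVC lives in the
# "estimator" attribute of OneVsRestClassifier.
# Note: "degree" only affects the "poly" kernel; SVC ignores it for "rbf".
parameters = {
    "estimator__C": [1, 2, 4, 8],
    "estimator__kernel": ["poly", "rbf"],
    "estimator__degree": [1, 2, 3, 4],
}

# Assumed CV setup: 10 stratified folds, shuffled with a fixed seed.
cv_generator = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

model_tunning = GridSearchCV(
    model_to_set,
    param_grid=parameters,
    scoring="f1_macro",  # assumption: macro-averaged F1 across classes
    cv=cv_generator,
    n_jobs=1,
)
model_tunning.fit(iris.data, iris.target)

print(model_tunning.best_score_)
print(model_tunning.best_params_)
```

After fitting, model_tunning.best_estimator_ holds the refitted OneVsRestClassifier with the selected SVC settings.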