I have been working to optimize an SVR model in scikit-learn, but have been unable to understand how to leverage GridSearchCV.
Consider a slightly modified case of the example code provided in the documentation:
from sklearn import svm, grid_search, datasets
iris = datasets.load_iris()
parameters = {'kernel': ('linear', 'rbf'), 'C':[1.5, 10]}
svr = svm.SVC()
clf = grid_search.GridSearchCV(svr, parameters)
clf.fit(iris.data, iris.target)
clf.get_params()
Since I specify that the search for the optimal C value covers just 1.5 and 10, I would expect the returned model to use one of those two values. However, when I look at the output, that does not appear to be the case:
{'cv': None,
'estimator': SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,
kernel='rbf', max_iter=-1, probability=False, random_state=None,
shrinking=True, tol=0.001, verbose=False),
'estimator__C': 1.0,
'estimator__cache_size': 200,
'estimator__class_weight': None,
'estimator__coef0': 0.0,
'estimator__degree': 3,
'estimator__gamma': 0.0,
'estimator__kernel': 'rbf',
'estimator__max_iter': -1,
'estimator__probability': False,
'estimator__random_state': None,
'estimator__shrinking': True,
'estimator__tol': 0.001,
'estimator__verbose': False,
'fit_params': {},
'iid': True,
'loss_func': None,
'n_jobs': 1,
'param_grid': {'C': [1.5, 10], 'kernel': ('linear', 'rbf')},
'pre_dispatch': '2*n_jobs',
'refit': True,
'score_func': None,
'scoring': None,
'verbose': 0}
I suspect I have a fundamental misunderstanding of GridSearchCV: how to use it, and what I can expect it to return. I had expected it to return a classifier with optimized parameters based on my search choices, but this does not appear to be the case.
Any guidance would be greatly appreciated.
Thank you very much.
GridSearchCV tries every combination of the values passed in the dictionary and evaluates the model for each combination using cross-validation. After fitting, we therefore have a score (accuracy or loss) for every combination of hyperparameters and can choose the one with the best performance.
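To make the "score for every combination" point concrete, here is a minimal sketch on the grid from the question. It assumes a scikit-learn version that ships `sklearn.model_selection` (the `grid_search` module used in the question belongs to older releases), and the choice of `cv=5` is mine, not the asker's.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
param_grid = {"kernel": ["linear", "rbf"], "C": [1.5, 10]}

# cv=5 is an assumption for this sketch; the question left cv unset.
search = GridSearchCV(svm.SVC(), param_grid, cv=5)
search.fit(iris.data, iris.target)

# cv_results_ holds one entry per combination: 2 kernels x 2 values of C.
tried = search.cv_results_["params"]
scores = search.cv_results_["mean_test_score"]
for params, mean_score in zip(tried, scores):
    print(params, round(float(mean_score), 3))
```

Every candidate printed here uses C=1.5 or C=10, which is where the values from the grid actually show up.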
GridSearchCV is a technique for searching a given grid of parameter values for the best combination. It is essentially a cross-validation method: you supply the model and the parameter grid, the best parameter values are extracted, and predictions are then made with them.
Two generic approaches to parameter search are provided in scikit-learn: for given values, GridSearchCV exhaustively considers all parameter combinations, while RandomizedSearchCV can sample a given number of candidates from a parameter space with a specified distribution.
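For contrast with the exhaustive grid, a randomized search might look like the sketch below. The exponential distribution for C, `n_iter=8`, `cv=5` and `random_state=0` are illustrative assumptions, not values taken from the question.

```python
from scipy.stats import expon
from sklearn import datasets, svm
from sklearn.model_selection import RandomizedSearchCV

iris = datasets.load_iris()

# A list is sampled uniformly; a scipy distribution is sampled via rvs().
# expon(scale=10) is an arbitrary choice for this sketch.
param_distributions = {"kernel": ["linear", "rbf"], "C": expon(scale=10)}

random_search = RandomizedSearchCV(
    svm.SVC(),
    param_distributions,
    n_iter=8,          # number of sampled candidates (assumed)
    cv=5,
    random_state=0,    # fixed only to make the sketch repeatable
)
random_search.fit(iris.data, iris.target)

sampled = random_search.cv_results_["params"]
print(len(sampled), random_search.best_params_)
```

Here the number of fitted candidates is set by `n_iter` rather than by the size of a grid.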
You should not use get_params here. Use best_params_ or best_estimator_.get_params() instead. get_params gives you back the constructor parameters that you gave it. One of them is estimator, where you gave it an SVC with default parameters, which is what you see here. That has nothing to do with the parameters that are tried in the grid search.
If you look at the examples (look at the bottom of the dev documentation for example) you will never see get_params used on GridSearchCV - or actually ever, I think ;) It is the interface that defines how GridSearchCV can use other estimators.
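As a rough illustration of the difference, here is the question's snippet with the fitted attributes read instead. It is a sketch that assumes a scikit-learn version providing `sklearn.model_selection` and the default `refit=True`.

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
parameters = {"kernel": ("linear", "rbf"), "C": [1.5, 10]}

clf = GridSearchCV(svm.SVC(), parameters)
clf.fit(iris.data, iris.target)

# The winning combination, always drawn from the grid.
best_params = clf.best_params_

# The estimator refit on the full data with that combination.
best_model = clf.best_estimator_

# get_params() still reports the unfitted template passed to the
# constructor, so C is the SVC default here.
template_C = clf.get_params()["estimator__C"]

print(best_params, best_model.get_params()["C"], template_C)
```

`clf.predict(...)` delegates to `best_estimator_`, so in most cases the fitted search object can be used directly as the tuned classifier.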