In simple words, what is the difference between cross-validation and grid search? How does grid search work? Should I first do cross-validation and then a grid search?
Conclusion: By using cross validation and grid search we were able to have a more meaningful result when compared to our original train/test split with minimal tuning. Cross validation is a very important method used to create better fitting models by training and testing on all parts of the training dataset.
Yes, GridSearchCV performs cross-validation. If I understand the concept correctly, you want to keep part of your data set unseen by the model in order to test it. So you train your models on the training set and test them on the test set.
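As a rough sketch of that workflow, assuming scikit-learn is installed: the iris dataset, the 70/30 split, the grid values, and cv=5 are all illustrative choices, not something the answer above prescribes. The search cross-validates on the training portion only, and the held-out test set is scored once at the end.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Illustrative dataset (an assumption; any labelled dataset would do).
X, y = load_iris(return_X_y=True)

# Hold out a test set that the grid search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Illustrative grid: 2 x 2 = 4 hyper-parameter combinations.
param_grid = {"gamma": [1, 0.1], "C": [1, 10]}

# Each combination is scored with 5-fold cross-validation on the training data.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
# With the default refit=True, score() uses the best model refit on all of X_train.
test_accuracy = search.score(X_test, y_test)
print(best_params, test_accuracy)
```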
The key difference from grid search is that in random search not all the values are tested, and the values that are tested are selected at random. For example, if there are 500 values in the distribution and we set n_iter=50, then random search will randomly sample 50 values to test.
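The 500-values example can be sketched with the standard library alone. The candidate values below are made up for illustration, and drawing without replacement from a finite list is an assumption about how the sampling is done.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# 500 hypothetical candidate values for one hyper-parameter.
candidates = [i / 100 for i in range(1, 501)]

n_iter = 50
# Random search evaluates only n_iter randomly chosen candidates,
# whereas grid search would evaluate all 500.
sampled = random.sample(candidates, n_iter)

print(len(candidates), len(sampled))
```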
When people refer to cross-validation they generally mean k-fold cross-validation. In k-fold cross-validation you have multiple (k) train-test splits instead of one. This means that in k-fold CV you train your model k times and test it k times, each time holding out a different fold for testing.
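A minimal sketch of how the k train-test splits can be formed, using contiguous folds over sample indices. The helper name k_fold_indices is hypothetical; real libraries usually offer shuffling and stratification as well.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs, one per fold, using contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# 10 samples, 5 folds: the model would be trained and tested 5 times.
folds = list(k_fold_indices(10, 5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Every sample ends up in the test fold exactly once, which is what distinguishes this from a single train-test split.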
Cross-validation is when you reserve part of your data to use in evaluating your model. There are different cross-validation methods. The simplest conceptually is to just take 70% (just making up a number here, it doesn't have to be 70%) of your data and use that for training, and then use the remaining 30% of the data to evaluate the model's performance. The reason you need different data for training and evaluating the model is to protect against overfitting. There are other (slightly more involved) cross-validation techniques, of course, like k-fold cross-validation, which is often used in practice.
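The simple 70/30 hold-out described above might look like this. The function name holdout_split and the shuffle-before-split step are assumptions made for illustration.

```python
import random


def holdout_split(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into a training part and an evaluation part."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 100 dummy samples: 70 go to training, 30 are kept for evaluation.
train, test = holdout_split(range(100))
print(len(train), len(test))
```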
Grid search is a method to perform hyper-parameter optimisation, that is, it is a method to find the best combination of hyper-parameters (an example of a hyper-parameter is the learning rate of the optimiser) for a given model (e.g. a CNN) and dataset. In this scenario, you have several models, each with a different combination of hyper-parameters. Each of these combinations of hyper-parameters, which corresponds to a single model, can be said to lie on a point of a "grid". The goal is then to train each of these models and evaluate them, e.g. using cross-validation. You then select the one that performed best.
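The procedure can be sketched generically as a loop over every point of the grid. Here grid_search and toy_score are hypothetical names, and toy_score is a made-up stand-in for "train the model and return its cross-validated score", so that the sketch runs without any training.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)  # in practice: a cross-validated score
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    # Hypothetical score that peaks at learning_rate=0.01, batch_size=32.
    return -abs(params["learning_rate"] - 0.01) - abs(params["batch_size"] - 32) / 100


best_params, best_score = grid_search(
    {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [16, 32]},
    toy_score,
)
print(best_params, best_score)
```

The cost grows with the product of the list lengths (here 3 x 2 = 6 evaluations), which is why random search is sometimes preferred for large grids.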
To give a concrete example, if you're using a support vector machine, you could use different values for gamma and C. So, for example, you could have a grid with the following values for (gamma, C): (1, 1), (0.1, 1), (1, 10), (0.1, 10). It's a grid because it's like a product of [1, 0.1] for gamma and [1, 10] for C. Grid search would basically train an SVM for each of these four pairs of (gamma, C) values, then evaluate it using cross-validation, and select the one that did best.
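The "product" in this example can be made explicit with the standard library. This only enumerates the four (gamma, C) pairs; it does not train anything.

```python
from itertools import product

gammas = [1, 0.1]
Cs = [1, 10]

# The grid is the Cartesian product of the two value lists.
grid = list(product(gammas, Cs))
print(grid)  # [(1, 1), (1, 10), (0.1, 1), (0.1, 10)]
```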
Cross-validation is a method for robustly estimating test-set performance (generalization) of a model. Grid-search is a way to select the best of a family of models, parametrized by a grid of parameters.
Here, by "model", I don't mean a trained instance, but rather the algorithm together with its parameters, such as SVC(C=1, kernel='poly').