What is the difference between cross-validation and grid search?

2 Answers

Cross-validation is when you reserve part of your data to evaluate your model. There are different cross-validation methods. The simplest conceptually is to take 70% of your data (an arbitrary figure; it doesn't have to be 70%) and use that for training, then use the remaining 30% to evaluate the model's performance. This is often called a hold-out split. You need different data for training and for evaluation to protect against overfitting: a model scored on the data it was trained on will look better than it really is. There are other, slightly more involved cross-validation techniques, such as k-fold cross-validation, which is often used in practice.
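As a rough illustration of the k-fold idea, here is a minimal sketch in plain Python. The helper name `k_fold_indices` is made up for this example and the data is not shuffled, which a real setup would usually do first. Each sample lands in the test fold exactly once, and the model would be trained on the remaining folds each time.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration only; no shuffling is done.
    """
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

The final score is then typically the average of the k per-fold scores, which is less sensitive to one lucky or unlucky split than a single 70/30 division.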

Grid search is a method for hyper-parameter optimisation: it finds the best combination of hyper-parameters (an example of a hyper-parameter is the learning rate of the optimiser) for a given model (e.g. a CNN) and dataset. In this scenario you have several candidate models, each with a different combination of hyper-parameters. Each combination, which corresponds to a single model, can be said to lie on a point of a "grid". The goal is to train each of these models and evaluate them, e.g. using cross-validation. You then select the one that performed best.
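The "grid" can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the hyper-parameter names, their values, and especially `evaluate`, which is a toy stand-in for "train the model with these settings and cross-validate it".

```python
import math
from itertools import product

# Hypothetical hyper-parameter grid; names and values are illustrative only.
grid = {"learning_rate": [0.1, 0.01, 0.001], "batch_size": [32, 64]}


def evaluate(params):
    # Toy placeholder score, NOT a real training run. In practice this would
    # train the model with `params` and return its cross-validated score.
    lr_penalty = abs(math.log10(params["learning_rate"]) + 2)
    batch_penalty = abs(params["batch_size"] - 64) / 64
    return -(lr_penalty + batch_penalty)


# Every point on the grid is one combination of hyper-parameters.
names = list(grid)
candidates = [dict(zip(names, values)) for values in product(*grid.values())]

# Grid search: score every candidate and keep the best one.
best = max(candidates, key=evaluate)
print(len(candidates), "candidates; best:", best)
```

The cost grows multiplicatively with each hyper-parameter added, since every combination is trained and scored.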

To give a concrete example, if you're using a support vector machine, you could try different values for gamma and C. For example, you could have a grid with the following values for (gamma, C): (1, 1), (0.1, 1), (1, 10), (0.1, 10). It's a grid because it's the product of [1, 0.1] for gamma and [1, 10] for C. Grid search would train an SVM for each of these four pairs of (gamma, C) values, evaluate each one using cross-validation, and select the one that did best.
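In scikit-learn, this example could look roughly like the sketch below. The iris dataset, the RBF kernel, and `cv=5` are assumptions chosen only to make the snippet runnable; they are not part of the example above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Assumed toy dataset, used only so the example runs end to end.
X, y = load_iris(return_X_y=True)

# The grid is the product of the two lists: four (gamma, C) pairs in total.
param_grid = {"gamma": [1, 0.1], "C": [1, 10]}

# cv=5 scores each (gamma, C) pair with 5-fold cross-validation.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best mean CV score:", search.best_score_)
```

Here cross-validation is the evaluation step and grid search is the loop around it, which is how the two ideas fit together.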

answered Oct 15 '22 02:10

Or Neeman

Cross-validation is a method for robustly estimating test-set performance (generalization) of a model. Grid-search is a way to select the best of a family of models, parametrized by a grid of parameters.

Here, by "model", I don't mean a trained instance but rather the algorithm together with its parameters, such as SVC(C=1, kernel='poly').
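To make the distinction concrete, here is a small scikit-learn sketch that cross-validates that one "model" (algorithm plus parameters). The iris dataset and `cv=5` are assumptions for the sake of a runnable example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Assumed toy dataset, used only so the example runs end to end.
X, y = load_iris(return_X_y=True)

# A "model" in the sense above: an algorithm plus its parameters, not yet trained.
model = SVC(C=1, kernel="poly")

# cross_val_score fits a fresh clone of the model on each training fold
# and scores it on the corresponding held-out fold.
scores = cross_val_score(model, X, y, cv=5)
print("mean accuracy:", scores.mean(), "std:", scores.std())
```

The `model` object itself is never fitted here; only its per-fold clones are, which is why the result describes the algorithm-plus-parameters choice rather than one particular trained instance.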

answered Oct 15 '22 00:10

Andreas Mueller

