Consider the following grid search:

grid = GridSearchCV(clf, parameters, n_jobs=-1, iid=True, cv=5)
grid_fit = grid.fit(X_train1, y_train1)
According to Sklearn's documentation, grid_fit.best_score_ returns "the mean cross-validated score of the best_estimator".
To me that would mean that the average of:
cross_val_score(grid_fit.best_estimator_, X_train1, y_train1, cv=5)
should be exactly the same as grid_fit.best_score_.
However, I am getting a 10% difference between the two numbers. What am I missing?
I am running the grid search on proprietary data, so I am hoping somebody has run into something similar in the past and can guide me without a fully reproducible example. I will try to reproduce this with the Iris dataset if it's not clear enough...
Grid search is essentially a brute-force strategy: it fits and evaluates the model with every hyperparameter combination in the grid. With cross_val_score you don't perform any search (none of the predefined parameter combinations are tried); you simply get the cross-validation score of the single estimator you pass in.
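To make the contrast concrete, here is a minimal sketch on a toy dataset; the SVC and the small grid are stand-ins for your clf and parameters, not your actual setup:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)
parameters = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}

# GridSearchCV cross-validates *every* combination in `parameters`
# and keeps the one with the highest mean test score.
grid = GridSearchCV(SVC(), parameters, cv=5)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)

# cross_val_score performs no search: it just cross-validates the single
# estimator you hand it, with whatever hyperparameters it already has.
scores = cross_val_score(SVC(C=1, gamma="scale"), X, y, cv=5)
print(scores.mean())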
The "best" parameters that GridSearchCV identifies are only the best among the combinations you included in your parameter grid.
Does GridSearchCV use cross-validation? Yes, it does: for each candidate, part of the data set is held out from fitting so the model can be scored on unseen samples; the model is trained on the training folds and evaluated on the held-out fold.
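One way to see the cross-validation it performs is to inspect grid.cv_results_, which stores the per-fold test scores of every candidate; continuing the sketch above, best_score_ is just the mean of the winning candidate's fold scores:

import numpy as np

# Per-fold test scores of the best candidate from the sketch above.
best = grid.best_index_
fold_scores = [grid.cv_results_["split%d_test_score" % i][best] for i in range(5)]

# best_score_ is the mean of these internally computed fold scores.
print(np.mean(fold_scores), grid.best_score_)  # the two values match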
When an integer is passed to the cv parameter, as in GridSearchCV(..., cv=int_number), StratifiedKFold is used for the cross-validation splitting (for classifiers; KFold otherwise). The data set is therefore split by StratifiedKFold, and the folds generated inside GridSearchCV are not necessarily the same as those generated by your separate cross_val_score call. Different splits can change the accuracy and therefore the best score.
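If you want the two numbers to be directly comparable, one option (a sketch, not the only fix) is to pass the same explicit splitter object to both GridSearchCV and cross_val_score so the folds are guaranteed to match. Note also that the deprecated iid=True in your call weights fold scores by test-fold size, which can by itself make best_score_ differ from the plain mean that cross_val_score reports. The dataset, clf and parameters below are placeholders for your own objects:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Placeholders for the asker's X_train1/y_train1, clf and parameters.
X_train1, y_train1 = make_classification(n_samples=500, random_state=0)
clf = SVC()
parameters = {"C": [0.1, 1, 10]}

cv = StratifiedKFold(n_splits=5)  # one splitter object reused by both calls

grid = GridSearchCV(clf, parameters, n_jobs=-1, cv=cv)
grid_fit = grid.fit(X_train1, y_train1)

scores = cross_val_score(grid_fit.best_estimator_, X_train1, y_train1, cv=cv)
print(grid_fit.best_score_, scores.mean())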