More than one estimator in GridSearchCV(sklearn)

Tags: python, scikit-learn, grid-search

I was reading the sklearn documentation page for GridSearchCV. One of the attributes of a GridSearchCV object is best_estimator_. So here is my question: how can I pass more than one estimator to a GridSearchCV object?

Using a dictionary like {'SVC()': {'C': 10, 'gamma': 0.01}, 'DecTreeClass()': {....}}?

asked Aug 01 '18 08:08

mikinoqwert

1 Answer

GridSearchCV works on parameters. It trains multiple estimators of the same class (SVC, DecisionTreeClassifier, or any other classifier), each with a different parameter combination taken from those specified in param_grid. best_estimator_ is the estimator that achieved the best cross-validation score on the data.

So essentially best_estimator_ is an object of that same class, initialized with the best parameters found.

So in the basic setup you cannot use multiple estimators in the grid-search.
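To make the point about best_estimator_ concrete, here is a minimal sketch of the basic setup with a single estimator class. The parameter values and cv=3 are assumptions chosen only to keep the example small; they are not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# One estimator class, several parameter combinations (illustrative values)
single_grid = GridSearchCV(SVC(), {'C': [1, 10], 'gamma': [0.01, 0.1]}, cv=3)
single_grid.fit(X, y)

# best_estimator_ is an SVC again, configured with the best parameters found
best = single_grid.best_estimator_
print(type(best).__name__)
print(single_grid.best_params_)
```

Whatever combination wins, the result is always an instance of the class that was passed in; only its parameters differ.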

But as a workaround, you can search over multiple estimators by using a Pipeline in which the estimator step is itself a "parameter" that GridSearchCV can set.

Something like this:

from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_iris
iris_data = load_iris()
X, y = iris_data.data, iris_data.target


# Just initialize the pipeline with any estimator you like    
pipe = Pipeline(steps=[('estimator', SVC())])

# Add a dict of estimator and estimator related parameters in this list
params_grid = [{
                'estimator':[SVC()],
                'estimator__C': [1, 10, 100, 1000],
                'estimator__gamma': [0.001, 0.0001],
                },
                {
                'estimator': [DecisionTreeClassifier()],
                'estimator__max_depth': [1,2,3,4,5],
                # "auto" was removed in scikit-learn 1.3; for classifiers it meant "sqrt"
                'estimator__max_features': [None, "sqrt", "log2"],
                },
               # {'estimator': [Any_other_estimator_you_want],
               #  'estimator__valid_param_of_your_estimator': [valid_values]},
              ]

grid = GridSearchCV(pipe, params_grid)
grid.fit(X, y)

# The winning estimator and its parameters
print(grid.best_params_)

You can add as many dicts to the params_grid list as you like, but make sure that each dict contains only parameters that are valid for its 'estimator'.
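Once such a search has been fitted, you will probably want to know which estimator class won and how the classes compare. The sketch below shows one possible way to read that from the fitted object. It uses a deliberately small grid, and the names small_grid, search, winner and best_per_class are made up for this example, as are cv=3 and random_state=0.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

pipe = Pipeline(steps=[('estimator', SVC())])

# Reduced grid, for illustration only
small_grid = [
    {'estimator': [SVC()], 'estimator__C': [1, 10]},
    {'estimator': [DecisionTreeClassifier(random_state=0)],
     'estimator__max_depth': [2, 3]},
]

search = GridSearchCV(pipe, small_grid, cv=3)
search.fit(X, y)

# The winning estimator is the 'estimator' step of the best pipeline
winner = search.best_estimator_.named_steps['estimator']
print(type(winner).__name__, search.best_score_)

# Best mean cross-validation score reached by each estimator class
best_per_class = {}
results = search.cv_results_
for params, score in zip(results['params'], results['mean_test_score']):
    name = type(params['estimator']).__name__
    best_per_class[name] = max(score, best_per_class.get(name, float('-inf')))
print(best_per_class)
```

Note that best_estimator_ here is the whole Pipeline, so the chosen classifier has to be pulled out through named_steps.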

answered Sep 23 '22 12:09

Vivek Kumar
