Result of GridSearchCV as table

Tags: python, machine-learning, scikit-learn, gridsearchcv

I ran a grid search with cross-validation on an SVM with an RBF kernel to find the optimal values of the parameters C and gamma, using the GridSearchCV class. Now I would like to get the results in a tabular format like

C/gamma 1e-3 1e-2 1e3
0.1      0.2  ..  0.3
1        0.9
10       ..   
100      ..

where each cell contains the accuracy score for that pair of parameter values.

Or at least, if the first solution is not possible, something simpler like

C    gamma  accuracy
0.1  1e-4      0.2 
...

I am not very skilled in Python, so I don't know where to start. Could you suggest a way to produce this kind of representation? The best solution would be to have the table as a plot, but a simple console print in either of those formats would also be fine. Thank you in advance.

asked Nov 13 '19 10:11

Gianluca Amprimo

2 Answers

You can use the cv_results_ attribute of the fitted GridSearchCV object, as shown below:

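import pandas as pd  # needed for the DataFrame construction further down
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()
# Search over two kernels and two values of C (2 x 2 = 4 candidates)
parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
svc = svm.SVC(gamma="scale")
# Evaluate every candidate with 5-fold cross-validation
clf = GridSearchCV(svc, parameters, cv=5)
clf.fit(iris.data, iris.target)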

Now inspect clf.cv_results_, a dictionary of arrays with one entry per parameter combination:

{'mean_fit_time': array([0.00049248, 0.00051575, 0.00051174, 0.00044131]),
 'mean_score_time': array([0.0002739 , 0.00027657, 0.00023718, 0.00023627]),
 'mean_test_score': array([0.98      , 0.96666667, 0.97333333, 0.98      ]),
 'param_C': masked_array(data=[1, 1, 10, 10],
              mask=[False, False, False, False],
        fill_value='?',
             dtype=object),
 'param_kernel': masked_array(data=['linear', 'rbf', 'linear', 'rbf'],
              mask=[False, False, False, False],
        fill_value='?',
             dtype=object),
 'params': [{'C': 1, 'kernel': 'linear'},
  {'C': 1, 'kernel': 'rbf'},
  {'C': 10, 'kernel': 'linear'},
  {'C': 10, 'kernel': 'rbf'}],
 'rank_test_score': array([1, 4, 3, 1], dtype=int32),
 'split0_test_score': array([0.96666667, 0.96666667, 1.        , 0.96666667]),
 'split1_test_score': array([1.        , 0.96666667, 1.        , 1.        ]),
 'split2_test_score': array([0.96666667, 0.96666667, 0.9       , 0.96666667]),
 'split3_test_score': array([0.96666667, 0.93333333, 0.96666667, 0.96666667]),
 'split4_test_score': array([1., 1., 1., 1.]),
 'std_fit_time': array([1.84329827e-04, 1.34653950e-05, 1.26220210e-04, 1.76294378e-05]),
 'std_score_time': array([6.23956317e-05, 1.34498512e-05, 3.57596078e-06, 4.68175419e-06]),
 'std_test_score': array([0.01632993, 0.02108185, 0.03887301, 0.01632993])}

You can combine params and mean_test_score to build the DataFrame you are looking for with the command below:

pd.concat([pd.DataFrame(clf.cv_results_["params"]),pd.DataFrame(clf.cv_results_["mean_test_score"], columns=["Accuracy"])],axis=1)

The resulting DataFrame looks like this:

    C   kernel  Accuracy
0   1   linear  0.980000
1   1   rbf     0.966667
2   10  linear  0.973333
3   10  rbf     0.980000
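For the C/gamma grid asked about in the question, the same two entries of cv_results_ can be pivoted so that C runs down the rows and gamma across the columns. This is only a minimal sketch: the iris data, the RBF-only search, and the specific C and gamma values are assumptions chosen for illustration, not the asker's actual setup.

```python
import pandas as pd
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()

# Hypothetical grid: replace with the C and gamma values you actually searched
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}
clf = GridSearchCV(svm.SVC(kernel="rbf"), param_grid, cv=5)
clf.fit(iris.data, iris.target)

# Long format: one row per (C, gamma) pair with its mean CV accuracy
scores = pd.DataFrame(clf.cv_results_["params"])
scores["accuracy"] = clf.cv_results_["mean_test_score"]

# Wide format: rows are C, columns are gamma, cells are accuracy
table = scores.pivot(index="C", columns="gamma", values="accuracy")
print(table)
```

Printing `scores` instead of `table` gives the simpler three-column layout (C, gamma, accuracy) from the question.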

Hope this helps!

answered Oct 17 '22 22:10

Parthasarathy Subburaj

Perhaps easier:

pd.DataFrame({'param': clf.cv_results_["params"], 'acc': clf.cv_results_["mean_test_score"]})

or:

df = pd.DataFrame(clf.cv_results_)
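Starting from the full DataFrame, the grid can also be drawn as a plot, which the question names as the preferred output. The sketch below assumes an RBF-only search over C and gamma on the iris data (illustrative values, not the asker's), uses plain matplotlib, and writes to a hypothetical file name; the param_C and param_gamma columns are cast to float so they sort numerically.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove to display with plt.show()
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

iris = datasets.load_iris()

# Hypothetical grid: replace with the values you actually searched
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}
clf = GridSearchCV(svm.SVC(kernel="rbf"), param_grid, cv=5)
clf.fit(iris.data, iris.target)

df = pd.DataFrame(clf.cv_results_)
# Parameter columns may be stored as objects; cast them for numeric ordering
df["param_C"] = df["param_C"].astype(float)
df["param_gamma"] = df["param_gamma"].astype(float)
grid = df.pivot(index="param_C", columns="param_gamma", values="mean_test_score")

fig, ax = plt.subplots()
im = ax.imshow(grid.to_numpy(), cmap="viridis")
ax.set_xticks(range(len(grid.columns)))
ax.set_xticklabels(grid.columns)
ax.set_yticks(range(len(grid.index)))
ax.set_yticklabels(grid.index)
ax.set_xlabel("gamma")
ax.set_ylabel("C")

# Write the mean accuracy inside each cell
for i in range(grid.shape[0]):
    for j in range(grid.shape[1]):
        ax.text(j, i, f"{grid.iat[i, j]:.2f}", ha="center", va="center", color="w")

fig.colorbar(im, ax=ax, label="mean CV accuracy")
fig.savefig("grid_search_heatmap.png")  # hypothetical output file name
```

The full DataFrame also carries std_test_score and rank_test_score, so the same pivot works for those columns by changing the `values` argument.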

answered Oct 18 '22 00:10

keramat
