Is it possible to tune parameters with grid search for custom kernels in scikit-learn?

1 Answer

One way to do this is to combine a Pipeline with SVC(kernel='precomputed') and wrap your custom kernel function as a sklearn estimator (a subclass of BaseEstimator and TransformerMixin).
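To see what SVC(kernel='precomputed') expects before a Pipeline is involved, here is a minimal sketch. It assumes scikit-learn >= 0.18 (for sklearn.model_selection and return_X_y) and uses a plain linear kernel as a stand-in for a custom kernel function; the scaling by 16 and the choice C=1.0 are illustrative assumptions, not part of the original answer.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split  # assumes scikit-learn >= 0.18
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel values lie in [0, 16]; scale them to [0, 1]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0)

# Linear kernel as a stand-in for a custom kernel function.
K_train = np.dot(X_train, X_train.T)  # shape (n_train, n_train)
K_test = np.dot(X_test, X_train.T)    # shape (n_test, n_train)

# With kernel='precomputed', fit() takes the train-vs-train kernel matrix
# and predict()/score() take the test-vs-train kernel matrix.
svm = SVC(kernel='precomputed', C=1.0)
svm.fit(K_train, y_train)
accuracy = svm.score(K_test, y_test)
print("Test accuracy: {:.3f}".format(accuracy))
```

The key point is that the kernel matrix passed at prediction time must be computed against the same training samples that were used in fit, which is why the wrapper needs to remember them.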

For example, sklearn provides the kernel function chi2_kernel(X, Y=None, gamma=1.0), which computes the kernel matrix between the feature vectors in X and Y. This function takes a parameter gamma, which is best chosen by cross-validation. We can run a grid search over the parameters of this function as follows:

from __future__ import print_function
from __future__ import division

import sys

import numpy as np

import sklearn
from sklearn.base import BaseEstimator, TransformerMixin
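from sklearn.datasets import load_digits
try:
    # scikit-learn >= 0.18
    from sklearn.model_selection import GridSearchCV, train_test_split
except ImportError:
    # Older releases, such as the 0.16.1 used for the output below
    from sklearn.cross_validation import train_test_split
    from sklearn.grid_search import GridSearchCV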
from sklearn.metrics import accuracy_score
from sklearn.metrics.pairwise import chi2_kernel
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

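# Wrapper class for the custom kernel chi2_kernel.
# fit() stores the training samples; transform() returns the kernel matrix
# between the samples it is given and the stored training samples.
class Chi2Kernel(BaseEstimator, TransformerMixin):
    def __init__(self, gamma=1.0):
        super(Chi2Kernel, self).__init__()
        self.gamma = gamma

    def transform(self, X):
        # Shape (n_samples, n_train_samples), as SVC(kernel='precomputed') expects
        return chi2_kernel(X, self.X_train_, gamma=self.gamma)

    def fit(self, X, y=None, **fit_params):
        # Remember the training samples so transform() can compare against them
        self.X_train_ = X
        return self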

def main():

    print('python: {}'.format(sys.version))
    print('numpy: {}'.format(np.__version__))
    print('sklearn: {}'.format(sklearn.__version__))
    np.random.seed(0)

    # Get some data to evaluate
    dataset = load_digits()
    X = dataset.data
    y = dataset.target
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)

    # Create a pipeline where our kernel wrapper Chi2Kernel is run
    # before SVC, so that SVC receives the precomputed kernel matrix.
    pipe = Pipeline([
        ('chi2', Chi2Kernel()),
        ('svm', SVC()),
    ])

    # Define the parameter grid. Parameters of a pipeline step are
    # addressed with the 'step__param' syntax, e.g. 'chi2__gamma'.
    cv_params = dict([
        ('chi2__gamma', 10.0**np.arange(-9,4)),
        ('svm__kernel', ['precomputed']),
        ('svm__C', 10.0**np.arange(-2,9)),
    ])

    # Do grid search to get the best parameter values of 'gamma' and 'C'.
    model = GridSearchCV(pipe, cv_params, cv=5, verbose=1, n_jobs=-1)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    acc_test = accuracy_score(y_test, y_pred)

    print("Test accuracy: {}".format(acc_test))
    print("Best params:")
    print(model.best_params_)

if __name__ == '__main__':
    main()

Output:

    python: 2.7.3 (default, Dec 18 2014, 19:10:20)
    [GCC 4.6.3]
    numpy: 1.8.0
    sklearn: 0.16.1
    Fitting 5 folds for each of 143 candidates, totalling 715 fits
    [Parallel(n_jobs=-1)]: Done   1 jobs       | elapsed:    0.4s
    [Parallel(n_jobs=-1)]: Done  50 jobs       | elapsed:    2.7s
    [Parallel(n_jobs=-1)]: Done 200 jobs       | elapsed:    9.8s
    [Parallel(n_jobs=-1)]: Done 450 jobs       | elapsed:   21.6s
    [Parallel(n_jobs=-1)]: Done 701 out of 715 | elapsed:   34.8s remaining:    0.7s
    [Parallel(n_jobs=-1)]: Done 715 out of 715 | elapsed:   35.4s finished
    Test accuracy: 0.989898989899
    Best params:
    {'chi2__gamma': 0.01, 'svm__C': 10.0, 'svm__kernel': 'precomputed'}
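The "143 candidates, totalling 715 fits" line in the log follows directly from the grid: 13 values of gamma times 11 values of C times 1 kernel setting gives 143 combinations, each fitted on 5 folds. A quick check, assuming scikit-learn >= 0.18 for ParameterGrid in sklearn.model_selection:

```python
import numpy as np
from sklearn.model_selection import ParameterGrid  # assumes scikit-learn >= 0.18

# Same grid as in the answer above.
cv_params = {
    'chi2__gamma': 10.0 ** np.arange(-9, 4),  # 13 values: 1e-9 ... 1e3
    'svm__kernel': ['precomputed'],           # 1 value
    'svm__C': 10.0 ** np.arange(-2, 9),       # 11 values: 1e-2 ... 1e8
}

n_candidates = len(ParameterGrid(cv_params))
n_fits = n_candidates * 5  # cv=5 folds per candidate
print(n_candidates, n_fits)  # 143 715
```

Because the cost grows multiplicatively with every parameter added, it can be worth starting with a coarser grid when the kernel function is expensive to evaluate.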

In your case, just replace chi2_kernel with your function that computes the kernel matrix.
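As a rough sketch of that substitution, the snippet below wraps a hand-written kernel the same way. The names my_kernel and MyKernel are hypothetical, the kernel itself (an exponential of the L1 distance) is only a placeholder for your own function, and the small grid and the iris data are chosen just to keep the run short. It assumes scikit-learn >= 0.18.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV  # assumes scikit-learn >= 0.18
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC


def my_kernel(X, Y, gamma=1.0):
    """Hypothetical custom kernel: exp(-gamma * L1 distance) for each row pair."""
    l1 = np.abs(X[:, None, :] - Y[None, :, :]).sum(axis=2)
    return np.exp(-gamma * l1)


class MyKernel(BaseEstimator, TransformerMixin):
    """Hypothetical wrapper that exposes gamma as a tunable parameter."""

    def __init__(self, gamma=1.0):
        self.gamma = gamma

    def fit(self, X, y=None):
        # Keep the training samples; transform() compares new samples to them.
        self.X_train_ = np.asarray(X, dtype=float)
        return self

    def transform(self, X):
        # Kernel matrix of shape (n_samples, n_train_samples)
        return my_kernel(np.asarray(X, dtype=float), self.X_train_,
                         gamma=self.gamma)


X, y = load_iris(return_X_y=True)
pipe = Pipeline([
    ('kernel', MyKernel()),
    ('svm', SVC(kernel='precomputed')),
])
param_grid = {
    'kernel__gamma': [0.01, 0.1, 1.0],
    'svm__C': [1.0, 10.0],
}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Any parameter you want to tune has to be an argument of the wrapper's __init__ and stored under the same attribute name, since GridSearchCV clones the estimator and sets parameters through get_params/set_params.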

193

answered Oct 19 '22 06:10

Tommi Kerola

Tags:

python

scikit-learn

user3733188
