The way I currently use sklearn's svm module is with its defaults. However, it's not doing particularly well on my dataset. Is it possible to provide a custom loss function, or a custom kernel? If so, how do I write such a function so that it matches what sklearn's svm expects, and how do I pass it to the trainer?
There is this example of how to do it in the scikit-learn documentation:
SVM custom kernel
The code cited there is:
import numpy as np

def my_kernel(x, y):
    """
    We create a custom kernel:

                 (2  0)
    k(x, y) = x  (    ) y.T
                 (0  1)
    """
    M = np.array([[2, 0], [0, 1.0]])
    return np.dot(np.dot(x, M), y.T)
I'd like to understand the logic behind this kernel. How do I choose the kernel matrix? And what exactly is y.T?
To answer your question, unless you have a very good idea of why you want to define a custom kernel, I'd stick with the built-ins. They are very fast, flexible, and powerful, and are well-suited to most applications.
That being said, let's go into a bit more detail:
A kernel function is a special kind of measure of similarity between two points: basically, a larger value means the points are more similar. The scikit-learn SVM is designed to work with any kernel function. Several kernels are built in (e.g. linear, radial basis function, polynomial, sigmoid), but you can also define your own.
Your custom kernel function should look something like this:
def my_kernel(x, y):
    """Compute My Kernel

    Parameters
    ----------
    x : array, shape=(N, D)
    y : array, shape=(M, D)
        input vectors for kernel similarity

    Returns
    -------
    K : array, shape=(N, M)
        matrix of similarities between x and y
    """
    # ... compute something here ...
    return similarity_matrix
The most basic kernel, a linear kernel, would look like this:
def linear_kernel(x, y):
    return np.dot(x, y.T)
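As a quick sanity check (a small sketch I'm adding here with made-up random data, not part of the original example), you can verify that the kernel returns one similarity per pair of input vectors:

import numpy as np

x = np.random.rand(5, 3)   # N=5 samples, D=3 features
y = np.random.rand(7, 3)   # M=7 samples, D=3 features

K = linear_kernel(x, y)
print(K.shape)  # (5, 7): K[i, j] is the similarity between x[i] and y[j]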
Equivalently, for two-dimensional inputs, you can write the same linear kernel with an explicit identity matrix:
def linear_kernel_2(x, y):
    M = np.array([[1, 0],
                  [0, 1]])
    return np.dot(x, np.dot(M, y.T))
The matrix M here defines the so-called inner product space in which the kernel acts. This matrix can be modified to define a new inner product space; the custom function from the example you linked to just modifies M to effectively double the importance of the first dimension in determining the similarity.
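To make that concrete, here is a small sketch (with made-up points) comparing the plain linear kernel to the custom kernel from the question; the only difference is how much the first dimension contributes:

import numpy as np

a = np.array([[1.0, 0.0]])  # differs from b only in the first dimension
b = np.array([[2.0, 0.0]])

print(linear_kernel(a, b))  # [[2.]]  -> x1*y1 + x2*y2
print(my_kernel(a, b))      # [[4.]]  -> 2*x1*y1 + x2*y2, the first dimension counts double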
More complicated non-linear modifications are possible as well, but you have to be careful: kernel functions must meet certain requirements (the resulting Gram matrix must be symmetric and positive semi-definite, i.e. the kernel must correspond to an inner product in some feature space), or the SVM algorithm will not work correctly.
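Finally, to answer the "how do I pass it to the trainer" part of the question: any callable with the signature above can be handed to SVC via its kernel argument. A minimal sketch with made-up toy data, using the my_kernel function from the example you linked:

import numpy as np
from sklearn.svm import SVC

# Made-up toy data: two well-separated clusters in 2D
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.5, 0.2],
              [3.0, 3.0], [4.0, 3.5], [3.5, 4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# scikit-learn calls the kernel itself: my_kernel(X, X) during fit,
# and my_kernel(X_test, X_train) during predict.
clf = SVC(kernel=my_kernel)
clf.fit(X, y)

print(clf.predict([[0.2, 0.1], [3.8, 3.9]]))  # expected: [0 1]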