
How to extract info from scikits.learn classifier to then use in C code

I have trained a bunch of RBF SVMs using scikits.learn in Python and then Pickled the results. These are for image processing tasks and one thing I want to do for testing is run each classifier on every pixel of some test images. That is, extract the feature vector from a window centered on pixel (i,j), run each classifier on that feature vector, and then move on to the next pixel and repeat. This is far too slow to do with Python.

Clarification: When I say "this is far too slow..." I mean that even the LIBSVM code that scikits.learn uses under the hood is too slow. I'm actually writing a manual decision function for the GPU, so that classification at each pixel happens in parallel.

Is it possible for me to load the classifiers with Pickle, and then grab some kind of attribute that describes how the decision is computed from the feature vector, and then pass that info to my own C code? In the case of linear SVMs, I could just extract the weight vector and bias vector and add those as inputs to a C function. But what is the equivalent thing to do for RBF classifiers, and how do I get that info from the scikits.learn object?
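For the linear case mentioned above, the C side could be as small as the sketch below. It assumes the weight vector and bias have already been copied out of the classifier (in scikit-learn these are normally the `coef_` and `intercept_` attributes of a linear-kernel model); the function name `linear_decision` and its signature are illustrative, not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Linear SVM decision value: dot(w, x) + b.
 * w          : weight vector of length n_features (assumed exported from Python)
 * b          : bias term
 * x          : feature vector under test, length n_features
 */
double linear_decision(const double *w, double b, const double *x,
                       size_t n_features)
{
    assert(w != NULL && x != NULL);
    double sum = b;
    for (size_t j = 0; j < n_features; ++j)
        sum += w[j] * x[j];
    return sum;
}
```

The sign of the returned value decides the class; the RBF case needs the support vectors as well, which is what the rest of the question is about.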

Added: First attempts at a solution.

It looks like the classifier object has the attribute support_vectors_, which contains the support vectors as the rows of an array. There is also the attribute dual_coef_, which for a two-class problem is a 1 by len(support_vectors_) array of coefficients. From the standard tutorials on non-linear SVMs, it appears that one should then do the following:

  • Compute the feature vector v from your data point under test. This will be a vector that is the same length as the rows of support_vectors_.
  • For each row i in support_vectors_, compute the squared Euclidean distance d[i] between that support vector and v.
  • Compute t[i] as exp(-gamma * d[i]), where gamma is the RBF parameter. Note that gamma scales the distance inside the exponential; it does not multiply the result.
  • Sum up dual_coef_[i] * t[i] over all i. Add the value of the intercept_ attribute of the scikits.learn classifier to this sum.
  • If the sum is positive, classify as 1. Otherwise, classify as 0.

Added: On numbered page 9 at this documentation link it mentions that indeed the intercept_ attribute of the classifier holds the bias term. I have updated the steps above to reflect this.

asked Dec 02 '11 17:12 by ely


1 Answer

Yes, your solution looks alright. To pass the raw memory of a numpy array directly to a C program, you can use the ctypes helpers from numpy, or wrap your C program with Cython and call it directly by passing the numpy array (see the docs at http://cython.org for more details).

However, I am not sure that trying to speed up the prediction on a GPU is the easiest approach: kernel support vector machines are known to be slow at prediction time, since their complexity depends directly on the number of support vectors, which can be high for highly non-linear (multi-modal) problems.

Alternative approaches that are faster at prediction time include neural networks (probably more complicated or slower to train well than SVMs, which have only two hyper-parameters, C and gamma) or transforming your data with a non-linear transformation based on distances to prototypes, followed by thresholding and max pooling over image areas (for image classification only).

  • For the first method, you will find good documentation in the deep learning tutorial.

  • For the second, read the recent papers by Adam Coates and have a look at this page on k-means feature extraction.

Finally, you can also try NuSVC models, whose regularization parameter nu has a direct impact on the number of support vectors in the fitted model: fewer support vectors mean faster prediction times (check the accuracy though; in the end it will be a trade-off between prediction speed and accuracy).

answered Oct 03 '22 11:10 by ogrisel