
Fastest SVM implementation usable in Python [closed]

I'm building some predictive models in Python and have been using scikit-learn's SVM implementation. It's been really great: easy to use and relatively fast.

Unfortunately, I'm starting to be constrained by my runtime. I run an RBF SVM on a full dataset of about 4,000 to 5,000 samples with 650 features. Each run takes about a minute. But with 5-fold cross-validation plus a grid search (using a coarse-to-fine search), it's getting a bit infeasible for my task at hand. So generally, do people have any recommendations for the fastest SVM implementation that can be used in Python? That, or any ways to speed up my modeling?

I've heard of LIBSVM's GPU implementation, which seems like it could work. I don't know of any other GPU SVM implementations usable in Python, but I would definitely be open to others. Also, does using the GPU significantly reduce runtime?

I've also heard that there are ways of approximating the RBF SVM by using a linear SVM plus a feature map in scikit-learn. I'm not sure what people think about this approach. Again, for anyone using it, does it significantly reduce runtime?
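For reference, the linear SVM plus feature map idea can be sketched roughly as follows. This is only an illustration, not a tuned setup: it assumes scikit-learn's `RBFSampler` (random Fourier features) followed by `LinearSVC`, and the synthetic data, `gamma`, `n_components`, and `C` values are placeholders rather than anything from my actual problem.

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import RBFSampler
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in data; sizes are assumptions chosen to keep the example fast.
X, y = make_classification(
    n_samples=1000, n_features=50, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Map the inputs into an explicit feature space that approximates the RBF
# kernel, then fit a linear SVM on the mapped features.
approx_rbf_svm = make_pipeline(
    StandardScaler(),
    RBFSampler(gamma=0.01, n_components=300, random_state=0),  # placeholder values
    LinearSVC(C=1.0, max_iter=5000),
)
approx_rbf_svm.fit(X_train, y_train)

accuracy = approx_rbf_svm.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

The number of components trades accuracy of the kernel approximation against training cost, so `gamma` and `n_components` would need to be searched just like `C`.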

All ideas for speeding up the program are most welcome.

tomas asked Feb 15 '12 18:02


People also ask

Why is SVM so slow?

The most likely explanation is that you're using too many training examples for your SVM implementation. SVMs are built around a kernel function. Many implementations store its values explicitly as an N×N matrix over the training points to avoid computing entries over and over again, so memory use grows quadratically with the number of examples N.

Can SVM use GPU?

scikit-learn's SVM implementation does not support GPUs.


1 Answer

The most scalable kernel SVM implementation I know of is LaSVM. It's written in C, hence wrappable in Python if you know Cython, ctypes or cffi. Alternatively, you can use it from the command line. You can use the utilities in sklearn.datasets to convert data from NumPy or CSR format into svmlight-formatted files that LaSVM can use as training and test sets.
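As a minimal sketch of that conversion step, assuming the `dump_svmlight_file` and `load_svmlight_file` helpers from `sklearn.datasets` (the toy data and the file name `train.svmlight` are made up for illustration):

```python
import os
import tempfile

import numpy as np
from scipy.sparse import csr_matrix
from sklearn.datasets import dump_svmlight_file, load_svmlight_file

# Toy sparse data standing in for a real training set (assumption).
rng = np.random.default_rng(0)
X_dense = rng.random((20, 5))
X_dense[X_dense < 0.5] = 0.0           # zero out entries to make it sparse
X = csr_matrix(X_dense)
y = rng.choice([-1, 1], size=20)       # binary labels

out_dir = tempfile.mkdtemp()
train_path = os.path.join(out_dir, "train.svmlight")  # hypothetical file name

# Write the data in svmlight format; zero_based=False gives 1-based
# feature indices, the usual convention for svmlight/libsvm tools.
dump_svmlight_file(X, y, train_path, zero_based=False)

# Read the file back to check that the round trip preserves the data.
X_back, y_back = load_svmlight_file(train_path, n_features=5, zero_based=False)
print(X_back.shape, y_back[:5])
```

The resulting file can then be passed to a command-line SVM tool that reads the svmlight format.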

ogrisel answered Sep 24 '22 19:09