
Kernel methods for large-scale datasets

A kernel-based classifier usually requires O(n^3) training time because of the inner-product computations between pairs of instances. To speed up training, the inner-product values can be pre-computed and stored in a two-dimensional array (the Gram matrix). However, when the number of instances is very large, say over 100,000, there will not be sufficient memory to do so: a 100,000 x 100,000 matrix of double-precision values alone takes about 80 GB.
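To make the memory problem concrete, here is a minimal NumPy sketch (the RBF kernel, the random data, and the sizes are illustrative assumptions, not part of the question itself) that pre-computes a full Gram matrix and shows how its footprint grows with n:

    import numpy as np

    def rbf_gram_matrix(X, gamma=1.0):
        """Pre-compute the full Gram matrix K[i, j] = k(x_i, x_j) for an RBF kernel."""
        # Squared distances via the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
        sq_norms = np.sum(X ** 2, axis=1)
        sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
        return np.exp(-gamma * np.maximum(sq_dists, 0.0))  # clamp rounding noise

    n, d = 5_000, 20  # 5,000 instances is still tractable in memory
    X = np.random.default_rng(0).normal(size=(n, d))
    K = rbf_gram_matrix(X)

    # Storage grows as O(n^2); at n = 100,000 the same matrix needs ~80 GB.
    print(f"n = {n:,}: Gram matrix uses {K.nbytes / 1e9:.2f} GB")
    print(f"n = 100,000: would use {(100_000 ** 2 * 8) / 1e9:.0f} GB")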

So, does anyone have a better idea for handling this?

asked Nov 15 '22 by developer.cyrus

1 Answer

For modern implementations of support vector machines, the scaling of the training algorithm depends on many factors, such as the nature of the training data and the kernel you are using. The O(n^3) bound is an analytical worst-case result and isn't particularly useful for predicting how SVM training will scale in real-world situations. For example, empirical estimates for the training algorithm used by SVMLight put its scaling with training-set size at approximately O(n^2).
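You can estimate that empirical exponent yourself by timing training runs on nested subsets of your data and fitting a power law t ~ c * n^alpha to the results. A rough sketch (it uses scikit-learn's libsvm-based SVC as an assumed stand-in rather than SVMLight; the exponent you measure will depend on your data, kernel, and parameters):

    import time
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    # Synthetic data, just for illustration.
    X, y = make_classification(n_samples=20_000, n_features=20, random_state=0)

    sizes, times = [], []
    for n in (2_000, 4_000, 8_000, 16_000):
        clf = SVC(kernel="rbf", C=1.0, gamma="scale")
        start = time.perf_counter()
        clf.fit(X[:n], y[:n])
        times.append(time.perf_counter() - start)
        sizes.append(n)

    # The slope of log(t) vs log(n) estimates the empirical exponent alpha.
    alpha, _ = np.polyfit(np.log(sizes), np.log(times), 1)
    print(f"empirical scaling: roughly O(n^{alpha:.2f})")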

I would suggest you ask this question on the kernel machines forum. I think you're more likely to get a better answer there than on Stack Overflow, which is more of a general-purpose programming site.

answered Jan 17 '23 by Stompchicken