
Kernel methods for large-scale datasets

A kernel-based classifier usually requires O(n^3) training time because of the inner-product computations between pairs of instances. To speed up training, the inner-product values can be pre-computed and stored in a two-dimensional array (the Gram matrix). However, when the number of instances is very large, say over 100,000, there will not be sufficient memory to do so: a 100,000 x 100,000 matrix of double-precision values alone takes about 80 GB.
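To make the memory problem concrete, here is a minimal NumPy sketch (the RBF kernel, the random data, and the sizes are illustrative assumptions, not part of the question itself) that pre-computes a full Gram matrix and shows how its footprint grows with n:

    import numpy as np

    def rbf_gram_matrix(X, gamma=1.0):
        """Pre-compute the full Gram matrix K[i, j] = k(x_i, x_j) for an RBF kernel."""
        # Squared distances via the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
        sq_norms = np.sum(X ** 2, axis=1)
        sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
        return np.exp(-gamma * np.maximum(sq_dists, 0.0))  # clamp rounding noise

    n, d = 5_000, 20  # 5,000 instances is still tractable in memory
    X = np.random.default_rng(0).normal(size=(n, d))
    K = rbf_gram_matrix(X)

    # Storage grows as O(n^2); at n = 100,000 the same matrix needs ~80 GB.
    print(f"n = {n:,}: Gram matrix uses {K.nbytes / 1e9:.2f} GB")
    print(f"n = 100,000: would use {(100_000 ** 2 * 8) / 1e9:.0f} GB")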

So, does anyone have a better idea for handling this?

asked Nov 15 '22 by developer.cyrus

1 Answer

For modern implementations of support vector machines, the scaling of the training algorithm depends on many factors, such as the nature of the training data and the kernel you are using. The O(n^3) bound is an analytical worst-case result and isn't particularly useful for predicting how SVM training will scale in real-world situations. For example, empirical estimates for the training algorithm used by SVMLight put its scaling with training-set size at approximately O(n^2).
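You can estimate that empirical exponent yourself by timing training runs on nested subsets of your data and fitting a power law t ~ c * n^alpha to the results. A rough sketch (it uses scikit-learn's libsvm-based SVC as an assumed stand-in rather than SVMLight; the exponent you measure will depend on your data, kernel, and parameters):

    import time
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    # Synthetic data, just for illustration.
    X, y = make_classification(n_samples=20_000, n_features=20, random_state=0)

    sizes, times = [], []
    for n in (2_000, 4_000, 8_000, 16_000):
        clf = SVC(kernel="rbf", C=1.0, gamma="scale")
        start = time.perf_counter()
        clf.fit(X[:n], y[:n])
        times.append(time.perf_counter() - start)
        sizes.append(n)

    # The slope of log(t) vs log(n) estimates the empirical exponent alpha.
    alpha, _ = np.polyfit(np.log(sizes), np.log(times), 1)
    print(f"empirical scaling: roughly O(n^{alpha:.2f})")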

I would suggest you ask this question on the kernel machines forum. I think you're more likely to get a better answer there than on Stack Overflow, which is more of a general-purpose programming site.

answered Jan 17 '23 by Stompchicken