I wrote the following code and tested it on a small dataset:

from sklearn import svm
from sklearn.multiclass import OneVsRestClassifier

classif = OneVsRestClassifier(svm.SVC(kernel='rbf'))
classif.fit(X, y)

where X and y are numpy arrays (X is a 30000x784 matrix, y is 30000x1). On the small dataset the algorithm works well and gives me correct results. But I started the run on the full data about 10 hours ago, and it is still in progress. I want to know how long it will take, or whether it is stuck somehow. (Laptop specs: 4 GB memory, Core i5-480M.)
Even the prediction time is polynomial in the number of test vectors. If you really must use an SVM, I'd recommend GPU acceleration or reducing the size of the training dataset. Try a sample of the data (10,000 rows, perhaps) first to check that the problem isn't the data format or distribution.
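A minimal sketch of that suggestion, assuming X and y are the arrays from the question: fit on a random subsample first, and only move to the full 30,000 rows if this finishes in a reasonable time.

import numpy as np
from sklearn import svm
from sklearn.multiclass import OneVsRestClassifier

# Draw 10,000 row indices without replacement.
rng = np.random.default_rng(0)
idx = rng.choice(X.shape[0], size=10_000, replace=False)
X_sample, y_sample = X[idx], np.ravel(y[idx])

# If this fit completes in minutes, the full fit is just slow, not stuck
# on a data-format problem.
classif = OneVsRestClassifier(svm.SVC(kernel='rbf'))
classif.fit(X_sample, y_sample)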
The implementation is based on libsvm. The fit time complexity is more than quadratic in the number of samples, which makes it hard to scale to datasets with more than a couple of tens of thousands of samples. If you do not need kernels and a linear SVM suffices, there is LinearSVC.
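If a linear decision boundary is acceptable for this data, a sketch along these lines (again assuming the X and y from the question) swaps the RBF kernel for LinearSVC, which scales far better with the number of samples:

import numpy as np
from sklearn.svm import LinearSVC

# LinearSVC trains a linear SVM via liblinear and handles multiclass targets
# one-vs-rest by default, so the OneVsRestClassifier wrapper is not needed.
linear_clf = LinearSVC(C=1.0, max_iter=10_000)

# ravel() turns the 30000x1 column vector into the 1-D label array
# scikit-learn expects.
linear_clf.fit(X, np.ravel(y))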
SVM training can take arbitrarily long; it depends on dozens of factors:

- the C parameter - the greater the misclassification penalty, the slower the training

In general, the basic SMO algorithm is O(n^3), so with 30,000 data points it has to run a number of operations proportional to 27,000,000,000,000 (2.7 x 10^13), which is a really huge number. What are your options? As noted above: subsample the training data, switch to a linear model such as LinearSVC, or use a GPU-accelerated SVM implementation.
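To turn that complexity argument into a rough answer to "how long will it take", here is a sketch (assuming the X and y from the question, and that the roughly cubic scaling holds) that times the fit on small subsets and extrapolates to the full dataset:

import time
import numpy as np
from sklearn import svm
from sklearn.multiclass import OneVsRestClassifier

# Time the fit on increasing subset sizes, then scale by (30000 / n)^3
# to get a ballpark figure for the full 30,000-row fit.
for n in (1_000, 2_000, 4_000):
    clf = OneVsRestClassifier(svm.SVC(kernel='rbf'))
    start = time.perf_counter()
    clf.fit(X[:n], np.ravel(y[:n]))
    elapsed = time.perf_counter() - start
    estimate_hours = elapsed * (30_000 / n) ** 3 / 3600
    print(f"n={n}: {elapsed:.1f}s, extrapolated full fit ~{estimate_hours:.1f}h")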