A few implementation details for a Support-Vector Machine (SVM)

Tags: machine-learning, svm, libsvm

In a particular application I needed machine learning (my background is what I studied in my undergraduate course). I used Support Vector Machines and solved the problem. It's working fine.

Now I need to improve the system. The problems are:

1. I get additional training examples every week. Right now the system retrains from scratch on the updated set (old examples + new examples). I want to make the learning incremental: use the previous knowledge (instead of the previous examples) together with the new examples to get a new model (knowledge).
2. Right now my training examples have 3 classes, so every training example fits into one of these 3 classes. I want an "unknown" class: anything that doesn't fit these 3 classes must be marked as "unknown". But I can't treat "unknown" as a new class and provide examples for it.
3. Assuming the "unknown" class is implemented: when the class is "unknown", the user of the application inputs what he thinks the class might be. Now I need to incorporate the user input into the learning, and I have no idea how to do this either. Would it make any difference if the user inputs a new class (i.e., a class that is not already in the training set)?

Do I need to choose a new algorithm, or can Support Vector Machines do this?

PS: I'm using the libsvm implementation of SVM.

asked Aug 10 '10 06:08

claws

2 Answers

I organized my answer the same way as your question (1., 2., 3.).

1. Can SVMs do incremental learning? Multi-layer perceptrons of course can, because subsequent training instances don't affect the basic network architecture; they just adjust the values of the weight matrices. But SVMs? It seems to me that (in theory) a single additional training instance could change which training points are selected as support vectors, so the old model alone might not be enough. But again, I don't know.
2. I think you can solve this problem quite easily by using LIBSVM's one-class SVM (svm_type `-s 2`), training one one-class model per known class. A standard SVM is a binary classifier, and LIBSVM handles multi-class problems by combining binary classifiers (one-against-one), so every sample is always assigned to one of the known classes. If you instead train and test one class at a time, then whatever is left after running each one-class model against the test set is "unknown". In other words, any data point that none of the sequential one-class classifications accepts is by definition in the "unknown" class.
3. Why not make the user's guess a feature (i.e., just another independent variable)? The only other option is to make it the class label itself, and you don't want that. So you would, for instance, add a column "user class guess" to your data matrix. For the data points that are not in the "unknown" category (and for which the user therefore offers no guess), populate it with some value most likely to have no effect. This value could be 0 or 1, but really it depends on how you have your data scaled and normalized.
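The reject-as-"unknown" idea in item 2 can be sketched roughly as follows. This is only an illustration under stated assumptions: `CentroidDetector` is a hypothetical toy stand-in for a trained one-class SVM (it accepts points within a fixed radius of the class mean), the radius and sample data are made up, and picking the closest accepting detector when several accept is one possible tie-break, not something LIBSVM prescribes.

```python
import math


class CentroidDetector:
    """Toy one-class model: accepts points within `radius` of the class mean.

    Hypothetical stand-in for a trained one-class SVM; not a LIBSVM API.
    """

    def __init__(self, samples, radius):
        n = len(samples)
        dim = len(samples[0])
        self.center = [sum(s[i] for s in samples) / n for i in range(dim)]
        self.radius = radius

    def distance(self, x):
        return math.sqrt(sum((c - xi) ** 2 for c, xi in zip(self.center, x)))

    def accepts(self, x):
        return self.distance(x) <= self.radius


def classify_with_unknown(x, detectors):
    """Ask every per-class detector; if none accepts x, return 'unknown'."""
    accepted = [(d.distance(x), label)
                for label, d in detectors.items() if d.accepts(x)]
    if not accepted:
        return "unknown"
    # Several detectors may accept the same point: keep the closest one.
    return min(accepted)[1]


# Made-up training data for three known classes.
detectors = {
    "A": CentroidDetector([[0, 0], [1, 0], [0, 1], [1, 1]], radius=2.0),
    "B": CentroidDetector([[10, 10], [11, 10], [10, 11], [11, 11]], radius=2.0),
    "C": CentroidDetector([[0, 10], [1, 10], [0, 11], [1, 11]], radius=2.0),
}

label_a = classify_with_unknown([0.6, 0.4], detectors)
label_b = classify_with_unknown([10.0, 10.2], detectors)
label_far = classify_with_unknown([5.0, 5.0], detectors)
```

With a real one-class SVM, the `accepts` check would be replaced by the sign of the model's decision value, and the acceptance boundary would be controlled by the model's own parameters rather than a hand-picked radius.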

answered Sep 20 '22 02:09

doug

Your first item will likely be the most difficult, since there are essentially no good incremental SVM implementations in existence.

A few months ago, I also researched online (incremental) SVM algorithms. Unfortunately, the available implementations are quite sparse. All I found was a MATLAB example, OnlineSVR (a thesis project that only supports regression), and SVMHeavy (which only supports binary classification).

I haven't used any of them personally. They all appear to be at the "research toy" stage. I couldn't even get SVMHeavy to compile.

For now, you can probably get away with doing periodic batch training to incorporate updates. I also use LibSVM, and it's quite fast, so it should be a good substitute until a proper incremental version is implemented.
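If batch retraining ever becomes too slow, one alternative outside LibSVM is a linear SVM trained by sub-gradient steps on the hinge loss. Like the perceptron case mentioned in the other answer, it only adjusts a weight vector for each new example, so old examples do not have to be kept. The sketch below is a swapped-in technique, not something LibSVM provides: `OnlineLinearSVM` is a hypothetical class, it assumes a linear kernel is good enough, it handles only two classes labeled +1 and -1 (three classes would need one such model per class, one-vs-rest), and the learning rate and regularization values are arbitrary.

```python
class OnlineLinearSVM:
    """Binary linear SVM updated one example at a time (hinge loss + L2)."""

    def __init__(self, n_features, learning_rate=0.1, reg=0.01):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.learning_rate = learning_rate  # assumed value, tune for real data
        self.reg = reg                      # L2 regularization strength

    def decision(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def partial_fit(self, x, y):
        """Update the model with a single example; y must be +1 or -1."""
        margin = y * self.decision(x)
        # Gradient of the L2 penalty: shrink the weights slightly.
        shrink = 1.0 - self.learning_rate * self.reg
        self.w = [shrink * wi for wi in self.w]
        if margin < 1.0:
            # The example is inside the margin or misclassified:
            # take a hinge-loss sub-gradient step towards it.
            self.w = [wi + self.learning_rate * y * xi
                      for wi, xi in zip(self.w, x)]
            self.b += self.learning_rate * y

    def predict(self, x):
        return 1 if self.decision(x) >= 0 else -1


# Made-up data: week 1 is used for the initial training ...
week1 = [([2.0, 2.0], 1), ([3.0, 3.0], 1), ([-2.0, -2.0], -1), ([-3.0, -1.0], -1)]
# ... and week 2 arrives later and is applied without revisiting week 1.
week2 = [([2.5, 1.5], 1), ([-1.5, -2.5], -1)]

model = OnlineLinearSVM(n_features=2)
for _ in range(5):                 # a few passes over the initial batch
    for x, y in week1:
        model.partial_fit(x, y)
for x, y in week2:                 # incremental update with new examples only
    model.partial_fit(x, y)
```

The trade-off is that this gives up kernels and LibSVM's exact solver, so whether it is an acceptable replacement depends on the data.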

I also don't think SVMs can model the concept of an "unknown" sample by default. They typically work as a series of binary classifiers, so a sample always ends up being positively classified as something, even if it is drastically different from anything seen previously. A possible workaround would be to model the ranges of your features, randomly generate samples that lie outside these ranges, and add them to your training set labeled as "unknown".

For example, if you have an attribute called "color", which has a minimum value of 4 and a maximum value of 123, then you could add these to your training set

    [({'color': 3}, 'unknown'), ({'color': 125}, 'unknown')]

to give your SVM an idea of what an "unknown" color means.
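The workaround above could be automated along these lines. This is a minimal sketch, assuming samples are dicts of numeric features as in the example; `feature_ranges` and `make_unknown_samples` are hypothetical helper names, and the `margin` and `spread` parameters are arbitrary choices for how far outside the observed range the synthetic values land.

```python
import random


def feature_ranges(samples):
    """Return {feature: (min, max)} over a list of dict samples."""
    ranges = {}
    for sample in samples:
        for name, value in sample.items():
            lo, hi = ranges.get(name, (value, value))
            ranges[name] = (min(lo, value), max(hi, value))
    return ranges


def make_unknown_samples(ranges, n, margin=0.1, spread=1.0, seed=0):
    """Generate n (sample, 'unknown') pairs with every feature out of range."""
    rng = random.Random(seed)
    generated = []
    for _ in range(n):
        sample = {}
        for name, (lo, hi) in ranges.items():
            width = (hi - lo) or 1.0  # avoid a zero offset for constant features
            offset = width * (margin + spread * rng.random())
            # Randomly place the value below the minimum or above the maximum.
            sample[name] = lo - offset if rng.random() < 0.5 else hi + offset
        generated.append((sample, "unknown"))
    return generated


# Made-up known samples whose 'color' spans 4..123, as in the example above.
known = [{"color": 4}, {"color": 60}, {"color": 123}]
ranges = feature_ranges(known)
unknown_samples = make_unknown_samples(ranges, n=20)
```

Note that with several features this only produces points that are out of range in every dimension; a point can be within range on each feature and still be far from all known classes, which this sketch does not cover.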

answered Sep 20 '22 02:09

Cerin
