I understand the differences between supervised and unsupervised learning: Supervised Learning is a way of "teaching" the classifier, using labeled data. Unsupervised Learning lets the classifier "learn by itself", for example, using clustering. But what is "weakly supervised learning"? How does it classify its examples?


What is weakly supervised learning (bootstrapping)?

1 Answer

Updated answer

As several comments below mention, the situation is not as simple as I originally wrote in 2013.

The generally accepted view is that

weak supervision - supervision with noisy labels (Wikipedia)
semi-supervision - only a subset of the training data has labels (Wikipedia)

There are also classifications that are more in line with my original answer. For example, Zhi-Hua Zhou's 2017 "A brief introduction to weakly supervised learning" treats weak supervision as an umbrella term for

incomplete supervision - only a subset of training data has labels (same as above)
inexact supervision - the training data are given with only coarse-grained labels
inaccurate supervision - the given labels are not always ground truth (the same as weak supervision above)
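
To make the three categories concrete, here is a minimal sketch of what the labels might look like in each case. The toy data, the bag size of 4, and the 25% flip rate are assumptions made up for illustration. They do not come from Zhou's paper, and grouping items into bags is only one possible form of coarse-grained labeling.

```python
import random

# Hypothetical ground truth for 8 items (binary labels).
true_labels = [0, 0, 0, 0, 1, 0, 1, 1]

# Incomplete supervision: only some items carry a label (None = unlabeled).
labeled_idx = {0, 4}
incomplete = [y if i in labeled_idx else None for i, y in enumerate(true_labels)]

# Inexact supervision: only a coarse label per group ("bag") of items is known.
# Assumption: a bag is labeled positive if it contains at least one positive item.
bags = [true_labels[i:i + 4] for i in range(0, len(true_labels), 4)]
inexact = [int(any(bag)) for bag in bags]

# Inaccurate supervision: every item has a label, but some labels are wrong.
rng = random.Random(0)
inaccurate = [1 - y if rng.random() < 0.25 else y for y in true_labels]

print(incomplete)
print(inexact)
print(inaccurate)
```

In all three cases the learner sees less information than the full `true_labels` list, which is what the umbrella term is meant to capture.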

Original answer

In short: In weakly supervised learning, you use a limited amount of labeled data.

How you select this data, and what exactly you do with it, depends on the method. In general, you start from a small amount of labeled data that is easy to get and/or makes a real difference, and then learn the rest. I consider bootstrapping to be a method that can be used in weakly supervised learning, but as the comment by Ben below shows, this is not a generally accepted view.

See, for example, Chris Biemann's 2007 dissertation for a nice overview. It says the following about bootstrapping/weakly-supervised learning:

Bootstrapping, also called self-training, is a form of learning that is designed to use even less training examples, therefore sometimes called weakly-supervised. Bootstrapping starts with a few training examples, trains a classifier, and uses thought-to-be positive examples as yielded by this classifier for retraining. As the set of training examples grows, the classifier improves, provided that not too many negative examples are misclassified as positive, which could lead to deterioration of performance.

For example, in part-of-speech tagging, one usually trains an HMM (or maximum-entropy, or some other) tagger on tens of thousands of words, each annotated with its POS tag. In weakly supervised tagging, you might start with a very small corpus of a few hundred words. You train a tagger on it, use that tagger to tag a corpus of a few thousand words, train a new tagger on the result, and use it to tag an even bigger corpus. Obviously, you have to be smarter than this, but it is a good start. (See this paper for a more advanced example of a bootstrapped tagger.)
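
The train, tag, retrain loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: a one-dimensional nearest-centroid classifier stands in for the tagger, and the names `train_centroids`, `predict`, `self_train`, and the `min_margin` confidence threshold are hypothetical. Accepting only confident predictions is one simple way of being "smarter" about what gets added to the training set.

```python
def train_centroids(points, labels):
    """Return the mean of each class: a tiny nearest-centroid 'classifier'."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(centroids, x):
    """Return (label, margin). The margin is how much closer x is to the
    nearest centroid than to the second nearest (assumes >= 2 classes)."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin


def self_train(seed_x, seed_y, unlabeled, min_margin=1.0, max_rounds=10):
    """Bootstrapping: train on the seed set, label the pool, keep only the
    confident predictions as new training data, and repeat."""
    train_x, train_y = list(seed_x), list(seed_y)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train_centroids(train_x, train_y)
        confident, rest = [], []
        for x in pool:
            label, margin = predict(model, x)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            train_x.append(x)
            train_y.append(label)
        pool = [x for x, _ in rest]
    return train_centroids(train_x, train_y), pool


# Two labeled seed examples and a larger unlabeled pool (made-up numbers).
model, leftover = self_train(
    seed_x=[0.0, 10.0],
    seed_y=["low", "high"],
    unlabeled=[1.0, 2.0, 8.0, 9.0, 4.0, 6.0, 5.0],
)
print(model, leftover)
```

Lowering `min_margin` lets the training set grow faster but admits more wrong labels, which is the deterioration risk mentioned in the quoted passage.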

Note: weakly supervised learning can also refer to learning with noisy labels. Such labels can be, but do not need to be, the result of bootstrapping.


answered Sep 19 '22 14:09

Jirka


Tags:

machine-learning

classification

Cheshie
