
Why isn't DropOut used in Unsupervised Learning?

All or nearly all of the papers using dropout are using it for supervised learning. It seems that it could just as easily be used to regularize deep autoencoders, RBMs and DBNs. So why isn't dropout used in unsupervised learning?

asked Oct 29 '13 by MWB


People also ask

When is dropout not useful?

Dropout is rarely applied to convolutional layers. Since convolutional layers have few parameters, they need little regularization to begin with. Furthermore, because of the spatial relationships encoded in feature maps, activations can become highly correlated, which renders standard dropout ineffective.
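A common workaround, not mentioned in the snippet above, is spatial dropout, which drops entire feature maps rather than individual activations. A minimal PyTorch sketch (layer sizes are illustrative):

    import torch
    import torch.nn as nn

    # Standard dropout zeroes individual activations; within a feature map
    # neighbouring activations are highly correlated, so little information
    # is actually removed. Spatial dropout (nn.Dropout2d) zeroes entire
    # channels instead, which breaks that correlation.
    block = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Dropout2d(p=0.2),  # zeroes whole channels (feature maps) at random
    )

    x = torch.randn(8, 3, 32, 32)  # batch of 8 RGB 32x32 images
    block.train()                  # dropout is only active in training mode
    y = block(x)
    print(y.shape)                 # torch.Size([8, 16, 32, 32])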

Should you always use dropout?

Use dropout to prevent overfitting, especially when the training set is small. Because dropout injects randomness into training, the weights are optimized for the general problem rather than for noise in the data.
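A minimal PyTorch sketch of that advice (the sizes and rate are hypothetical): a small classifier where dropout silences a random half of the hidden units on each training pass, so no single unit can memorize noise in a small training set.

    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(20, 64),
        nn.ReLU(),
        nn.Dropout(p=0.5),  # active under model.train(), identity under model.eval()
        nn.Linear(64, 2),
    )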

Where should dropout be applied?

Typically, dropout is applied after the non-linear activation function. When using rectified linear units (ReLUs), however, it can make sense to apply dropout before the activation for reasons of computational efficiency, depending on the particular implementation; with ReLU the two orderings produce the same output.
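A PyTorch sketch of the two orderings (layer sizes hypothetical):

    import torch.nn as nn

    # Usual ordering: dropout after the activation.
    after_act = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Dropout(0.5))

    # With ReLU, dropping before the activation gives the same output:
    # relu(mask * x / (1 - p)) == mask * relu(x) / (1 - p), because a
    # non-negative scale commutes with max(0, .). Dropping first can be
    # cheaper, depending on the implementation.
    before_act = nn.Sequential(nn.Linear(128, 128), nn.Dropout(0.5), nn.ReLU())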

Can dropout be used in CNNs?

We can apply a Dropout layer to the input vector, in which case it nullifies some of its features; but we can also apply it to a hidden layer, in which case it nullifies some hidden neurons. Dropout layers are important in training CNNs because they prevent overfitting on the training data.
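A small CNN sketch showing both placements (shapes assume 1x28x28 inputs and are hypothetical):

    import torch.nn as nn

    cnn = nn.Sequential(
        nn.Dropout(p=0.1),                 # input dropout: nullifies input features
        nn.Conv2d(1, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(2),
        nn.Flatten(),
        nn.Linear(8 * 14 * 14, 64),
        nn.ReLU(),
        nn.Dropout(p=0.5),                 # hidden dropout: nullifies hidden neurons
        nn.Linear(64, 10),
    )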


2 Answers

Dropout is used in unsupervised learning. For example:

Shuangfei Zhai, Zhongfei Zhang: Dropout Training of Matrix Factorization and Autoencoder for Link Prediction in Sparse Graphs (arXiv, 14 Dec 2015)
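In the same spirit (a sketch, not the paper's exact model), a dropout-regularized autoencoder in PyTorch: dropout on the hidden code forces the decoder to reconstruct from a random subset of code units, and the reconstruction loss needs no labels, so this is dropout in a purely unsupervised setting.

    import torch.nn as nn

    autoencoder = nn.Sequential(
        nn.Linear(784, 128),  # encoder
        nn.ReLU(),
        nn.Dropout(p=0.5),    # regularize the hidden representation
        nn.Linear(128, 784),  # decoder
        nn.Sigmoid(),
    )
    # Train with a reconstruction loss, e.g. nn.MSELoss()(autoencoder(x), x);
    # no labels are involved.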

answered Oct 01 '22 by Martin Thoma


Labeled data is relatively scarce, and that's why supervised learning often benefits from strong regularization, like DropOut.

On the other hand, unlabeled data is usually plentiful, and that's why DropOut is typically not needed and may even be detrimental (since it reduces the model's effective capacity).

Even gigantic models like GPT-3 (175 billion parameters) are still underfitting after training on 300 billion tokens.
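To make the capacity point concrete, here is what (inverted) dropout does during training, as a plain NumPy sketch: each step only trains a random sub-network, so a model that is already underfitting has nothing to gain from it.

    import numpy as np

    rng = np.random.default_rng(0)

    def inverted_dropout(x, p=0.5, training=True):
        # Zero each activation with probability p; rescale survivors by
        # 1/(1-p) so the expected activation matches inference, where
        # dropout is simply the identity.
        if not training:
            return x
        mask = rng.random(x.shape) >= p
        return x * mask / (1.0 - p)

    h = np.ones(8)
    print(inverted_dropout(h))  # e.g. [2. 0. 2. 2. 0. 0. 2. 2.]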

answered Oct 01 '22 by MWB