 

Why doesn't word2vec use regularization?

ML models with a huge number of parameters tend to overfit (since they have high variance). In my opinion, word2vec is one such model. One way to reduce a model's variance is to apply a regularization technique, which is very common for other embedding models, such as matrix factorization. However, the basic version of word2vec doesn't have any regularization term. Is there a reason for this?
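
For concreteness, here is a toy numpy sketch (the names, sizes, and losses are my own illustration, not taken from any word2vec implementation) contrasting the skip-gram negative-sampling loss, which as usually written carries no penalty term, with a matrix-factorization loss that includes the explicit L2 regularization the question refers to:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 1000, 50                              # toy vocabulary size and embedding dimension
W_in = rng.normal(scale=0.1, size=(V, d))    # "input" word vectors
W_out = rng.normal(scale=0.1, size=(V, d))   # "output" (context) vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_loss(center, context, negatives):
    # Skip-gram negative-sampling loss for one (center, context) pair.
    # Note that no regularization term appears anywhere in this objective.
    pos = -np.log(sigmoid(W_in[center] @ W_out[context]))
    neg = -np.log(sigmoid(-W_in[center] @ W_out[negatives].T)).sum()
    return pos + neg

def mf_loss(cooc, lam=1e-3):
    # Squared-error matrix-factorization loss with an explicit L2 penalty,
    # the kind of regularization common in factorization-based embeddings.
    recon = W_in @ W_out.T
    return ((cooc - recon) ** 2).sum() + lam * ((W_in ** 2).sum() + (W_out ** 2).sum())

print(sgns_loss(3, 17, np.array([5, 9, 42])))
print(mf_loss(np.zeros((V, V))))
```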

asked Jan 15 '18 by Tural Gurbanov


People also ask

What is the problem with Word2Vec?

Perhaps the biggest problem with word2vec is the inability to handle unknown or out-of-vocabulary (OOV) words. If your model hasn't encountered a word before, it will have no idea how to interpret it or how to build a vector for it. You are then forced to use a random vector, which is far from ideal.
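
A minimal sketch of that OOV behaviour, assuming gensim 4.x (the tiny corpus and the fallback helper are made up for the example): a word the model never saw has no learned vector, so you end up substituting a random one.

```python
import numpy as np
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat"], ["the", "dog", "sat"]]
model = Word2Vec(sentences, vector_size=10, min_count=1, seed=1)

def vector_for(word, model):
    # Return the learned vector, or a random fallback for OOV words.
    if word in model.wv:
        return model.wv[word]
    # The model has never seen this word and has no vector for it;
    # falling back to a random vector is the far-from-ideal workaround.
    return np.random.default_rng(0).normal(size=model.wv.vector_size)

print(vector_for("cat", model)[:3])    # learned in-vocabulary vector
print(vector_for("zebra", model)[:3])  # random fallback for an OOV word
```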

Why is Word2Vec better than TF-IDF?

A key difference between TF-IDF and word2vec is that TF-IDF is a statistical measure that we can apply to terms in a document and then use to form a vector, whereas word2vec produces a vector for a term, and more work may then be needed to convert that set of vectors into a singular vector or other ...
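
A rough sketch of that difference, assuming scikit-learn and gensim (the two example documents are made up): TF-IDF yields one vector per document directly, while word2vec yields one dense vector per term and needs an extra aggregation step, here a simple average.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from gensim.models import Word2Vec

docs = ["the cat sat on the mat", "the dog sat on the log"]

# TF-IDF: each document maps directly to a single (sparse) vector.
tfidf = TfidfVectorizer()
doc_vectors = tfidf.fit_transform(docs)              # shape: (n_docs, n_terms)

# word2vec: we get one dense vector per *term*; turning a document into a
# single vector needs an extra step, e.g. averaging its word vectors.
tokens = [d.split() for d in docs]
w2v = Word2Vec(tokens, vector_size=10, min_count=1, seed=1)
doc_embedding = np.mean([w2v.wv[t] for t in tokens[0]], axis=0)

print(doc_vectors.shape)    # (2, vocabulary size)
print(doc_embedding.shape)  # (10,)
```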

Does Word2Vec use deep learning?

No, Word2Vec is not a deep learning model. It can use continuous bag-of-words or continuous skip-gram as distributed representations, but in either case the number of parameters, layers, and non-linearities is too small for it to be considered a deep learning model.
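
To make "too small" concrete, here is a toy numpy sketch (sizes are arbitrary, my own illustration) of the whole skip-gram forward pass: one embedding lookup, one linear projection, and a softmax, i.e. roughly 2·V·d parameters and no stack of non-linear hidden layers.

```python
import numpy as np

V, d = 10_000, 100                     # toy vocabulary and embedding sizes
W_in = np.random.randn(V, d) * 0.01    # embedding (lookup) layer
W_out = np.random.randn(d, V) * 0.01   # output projection to vocabulary scores

def skipgram_forward(center_id):
    # The entire forward pass: lookup -> linear projection -> softmax.
    # There is no stack of hidden non-linear layers.
    h = W_in[center_id]                # embedding lookup
    scores = h @ W_out                 # single linear projection
    probs = np.exp(scores - scores.max())
    return probs / probs.sum()         # softmax over context-word probabilities

print("parameter count:", W_in.size + W_out.size)  # 2 * V * d
print(skipgram_forward(42).shape)                  # (V,)
```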


1 Answer

That's an interesting question.

I'd say that overfitting in Word2Vec doesn't make a lot of sense, because the goal of word embeddings is to match the word co-occurrence distribution as exactly as possible. Word2Vec is not designed to learn anything outside of the training vocabulary, i.e., to generalize; it only has to approximate the one distribution defined by the text corpus. In this sense, Word2Vec is actually trying to fit the data exactly, so it can't over-fit.

If you had a small vocabulary, it'd be possible to compute the co-occurrence matrix directly and find the exact global minimum for the embeddings (of a given size), i.e., get the perfect fit, and that would define the best contextual word model for this fixed language.
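
As a toy illustration of that idea (my own example, using a squared-error factorization rather than word2vec's actual objective): a truncated SVD of a small co-occurrence matrix gives the globally optimal rank-d factorization, i.e. the "perfect fit" for that loss and embedding size.

```python
import numpy as np

# Toy symmetric co-occurrence counts for a 5-word vocabulary (made-up numbers).
C = np.array([
    [0, 4, 1, 0, 2],
    [4, 0, 3, 1, 0],
    [1, 3, 0, 2, 1],
    [0, 1, 2, 0, 3],
    [2, 0, 1, 3, 0],
], dtype=float)

d = 2                                   # embedding size
U, s, Vt = np.linalg.svd(C)

# Keeping the top-d singular directions gives the best possible rank-d
# approximation under squared error (Eckart-Young), i.e. the exact global
# minimum of that factorization objective for this fixed corpus.
word_vecs = U[:, :d] * np.sqrt(s[:d])
context_vecs = Vt[:d, :].T * np.sqrt(s[:d])

approx = word_vecs @ context_vecs.T
print("rank-%d reconstruction error: %.4f" % (d, np.linalg.norm(C - approx)))
```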

answered Oct 12 '22 by Maxim