word2vec - what is best? add, concatenate or average word vectors?

I am working on a recurrent language model. To learn word embeddings that can be used to initialize my language model, I am using gensim's word2vec model. After training, the word2vec model holds two vectors for each word in the vocabulary: the word embedding (a row of the input-to-hidden matrix) and the context embedding (a column of the hidden-to-output matrix).

As outlined in this post there are at least three common ways to combine these two embedding vectors:

summing the context and word vector for each word
averaging the context and word vector (summing, then dividing by two)
concatenating the context and word vector
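
The three options can be sketched as follows. This is a minimal illustration, assuming each word has a word vector and a context vector of equal length. The names `word_vec` and `context_vec` and the toy values are made up for the example, and plain Python lists stand in for the rows and columns of the two weight matrices.

```python
def combine_sum(word_vec, context_vec):
    """Element-wise sum; keeps the original dimensionality."""
    return [w + c for w, c in zip(word_vec, context_vec)]


def combine_average(word_vec, context_vec):
    """Element-wise mean, i.e. the sum scaled by 1/2."""
    return [(w + c) / 2 for w, c in zip(word_vec, context_vec)]


def combine_concat(word_vec, context_vec):
    """Concatenation; doubles the dimensionality."""
    return list(word_vec) + list(context_vec)


# Toy 3-dimensional vectors (illustrative values, not trained embeddings).
word_vec = [1.0, 2.0, 3.0]
context_vec = [3.0, 0.0, -1.0]

summed = combine_sum(word_vec, context_vec)
averaged = combine_average(word_vec, context_vec)
concatenated = combine_concat(word_vec, context_vec)
```

Sum and average differ only by a constant factor, so both results point in the same direction. Concatenation doubles the embedding size, which changes the input size of whatever model consumes the embeddings.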

However, I couldn't find proper papers or reports on the best strategy. So my questions are:

Is there a commonly accepted choice between summing, averaging, and concatenating the vectors?
Or does the best way depend entirely on the task in question? If so, what strategy is best for a word-level language model?
Why combine the vectors at all? Why not use the "original" word embeddings for each word, i.e. those contained in the weight matrix between input and hidden neurons?

Related (but unanswered) questions:

word2vec: Summing/concatenate inside and outside vector
why we use input-hidden weight matrix to be the word vectors instead of hidden-output weight matrix?

asked Oct 23 '17 12:10

Lemon

2 Answers

I found an answer in the Stanford lecture "Deep Learning for Natural Language Processing" (Lecture 2, March 2016). It's available here. Around minute 46, Richard Socher states that the common approach is to average the two word vectors.

answered Sep 25 '22 01:09

Lemon

You should read this research work at least once to get the whole idea of combining word embeddings using different algebraic operators. It was my research.

In this paper you can also see the other methods to combine word vectors.

In short, L1-normalized averages of word vectors and sums of word vectors are good representations.
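
One possible reading of these two representations is sketched below: average the vectors of the words in a text and divide the result by its L1 norm (the sum of absolute values), or simply sum the vectors. The order of operations (average first, then normalize) is an assumption rather than something stated here, and the function names and toy vectors are hypothetical.

```python
def average_vectors(vectors):
    """Element-wise mean over a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sum_vectors(vectors):
    """Element-wise sum over a list of equal-length vectors."""
    return [sum(column) for column in zip(*vectors)]


def l1_normalize(vec):
    """Scale a vector so its absolute values sum to 1 (zero vectors are returned unchanged)."""
    norm = sum(abs(x) for x in vec)
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


# Toy 2-dimensional word vectors for a two-word text (illustrative values only).
vectors = [[1.0, -2.0], [3.0, 0.0]]

avg = average_vectors(vectors)
l1_avg = l1_normalize(avg)
total = sum_vectors(vectors)
```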

answered Sep 24 '22 01:09

Nomiluks


Tags: python, word-embedding, gensim, word2vec, language-model
