In the paper that I am trying to implement, it says,
In this work, tweets were modeled using three types of text representation. The first one is a bag-of-words model weighted by tf-idf (term frequency - inverse document frequency) (Section 2.1.1). The second represents a sentence by averaging the word embeddings of all words (in the sentence) and the third represents a sentence by averaging the weighted word embeddings of all words, the weight of a word is given by tf-idf (Section 2.1.2).
I am not sure about the third representation, the weighted word embeddings, where the weight of a word is given by tf-idf. I am not even sure whether tf-idf and word embeddings can be used together.
A word embedding is a learned representation for text where words that have the same meaning have a similar representation. It is this approach to representing words and documents that may be considered one of the key breakthroughs of deep learning on challenging natural language processing problems.
TF-IDF weighted Word2Vec: in this method, we first calculate the tf-idf value of each word, then follow the same averaging approach as in the section above, except that each word vector is multiplied by the word's tf-idf value and the sum is divided by the sum of the tf-idf values.
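A minimal sketch of that weighted averaging, using a toy corpus, hand-computed tf-idf, and invented 3-dimensional word vectors (in practice the vectors would come from a trained Word2Vec model; everything here is illustrative only):

```python
import math
import numpy as np

# Toy corpus and toy 3-dimensional word embeddings (all values invented
# for illustration; real embeddings come from a trained Word2Vec model).
corpus = [
    ["the", "frog", "sat"],
    ["the", "toad", "sat"],
    ["frogs", "and", "toads"],
]
embeddings = {
    "the":   np.array([0.1, 0.0, 0.2]),
    "frog":  np.array([0.9, 0.8, 0.1]),
    "frogs": np.array([0.85, 0.75, 0.15]),
    "toad":  np.array([0.8, 0.9, 0.2]),
    "toads": np.array([0.75, 0.85, 0.25]),
    "sat":   np.array([0.2, 0.1, 0.9]),
    "and":   np.array([0.0, 0.1, 0.1]),
}

def idf(word):
    """Inverse document frequency over the toy corpus."""
    df = sum(1 for doc in corpus if word in doc)
    return math.log(len(corpus) / df)

def tfidf_weighted_vector(doc):
    """tf-idf weighted average: multiply each word vector by the word's
    tf-idf weight, sum, then divide by the sum of the tf-idf weights."""
    total, weight_sum = np.zeros(3), 0.0
    for word in doc:
        w = (doc.count(word) / len(doc)) * idf(word)  # tf * idf
        total += w * embeddings[word]
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else total

vec = tfidf_weighted_vector(corpus[0])
print(vec.shape)  # (3,)
```

The division by the summed weights keeps the sentence vector on the same scale as the word vectors, so rare (high-idf) words dominate the average while frequent words like "the" contribute little.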
Thus by using word embeddings, words that are close in meaning are grouped near one another in vector space. For example, when representing a word such as frog, its nearest neighbours would be frogs, toads, and Litoria.
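That "nearest in meaning" idea can be checked with cosine similarity. The vectors below are invented for illustration (a real trained model would learn them), but they show the mechanics:

```python
import numpy as np

# Toy embeddings (invented for illustration; a trained Word2Vec model
# would learn vectors with this kind of neighbourhood structure).
embeddings = {
    "frog":  np.array([0.9, 0.8, 0.1]),
    "frogs": np.array([0.85, 0.75, 0.15]),
    "toads": np.array([0.8, 0.9, 0.2]),
    "car":   np.array([0.1, 0.05, 0.95]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest_neighbours(word, k=2):
    """Rank all other words by cosine similarity to `word`."""
    sims = {w: cosine(embeddings[word], v)
            for w, v in embeddings.items() if w != word}
    return sorted(sims, key=sims.get, reverse=True)[:k]

print(nearest_neighbours("frog"))  # the amphibians rank above "car"
```

With a real model (e.g. gensim's `KeyedVectors.most_similar`) the same ranking emerges from vectors learned on a large corpus rather than hand-picked ones.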
Even though Word2Vec is an unsupervised model, in the sense that you can give it a corpus without any label information and it will produce dense word embeddings, Word2Vec internally frames the task as a supervised classification problem (predicting words from their context) to learn these embeddings from the corpus.
Averaging word embeddings, possibly weighted, makes sense as a sentence representation, though depending on the main algorithm and the training data it may not be the best choice.
See also this paper by Kenter et al. There is a nice post that performs the comparison of these two approaches in different algorithms, and concludes that none is significantly better than the other: some algorithms favor simple averaging, some algorithms perform better with TF-IDF weighting.
In this article or this one, we use weighted sums with idf weighting, part-of-speech weighting, and a mixed method that uses both. The mixed method performed best and helped us place first in the SemEval 2017 similarity task for English-Spanish and for Arabic-Arabic (officially we were second for Arabic because, for some reason, we did not submit the mixed method).
It is very easy to implement and to use; the formula is in the article, but in a nutshell the vector of a sentence is simply V = sum_{i=1}^{k} PosWeight(w_i) * IDFWeight(w_i) * V_i
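That sum can be sketched in a few lines. The POS and IDF weight values below are made up for illustration; the article defines the actual weighting schemes:

```python
import numpy as np

# Illustrative per-word data: (POS weight, IDF weight, word vector).
# All values are invented; the article defines the real weights.
sentence = {
    "cats":  (1.0, 2.1, np.array([0.8, 0.2])),  # noun: high POS weight
    "chase": (0.8, 2.5, np.array([0.3, 0.9])),  # verb
    "the":   (0.1, 0.2, np.array([0.1, 0.1])),  # determiner: low weight
}

# V = sum_{i=1}^{k} PosWeight(w_i) * IDFWeight(w_i) * V_i
V = sum(pos * idf * vec for pos, idf, vec in sentence.values())
print(V)
```

Each word's contribution is scaled by both weights before summing, so content words (nouns, verbs) with high idf dominate the sentence vector while function words are nearly ignored.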