How can a sentence or a document be converted to a vector?

We have models for converting words to vectors (for example the word2vec model). Do similar models exist which convert sentences/documents into vectors, using perhaps the vectors learnt for the individual words?

648

asked Jun 12 '15 05:06

Sahil

2 Answers

1) Skip-gram method: paper here, and the tool that implements it, Google's word2vec.

2) Using LSTM-RNN to form semantic representations of sentences.

3) Representations of sentences and documents. The Paragraph Vector is introduced in this paper. It is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of text, such as sentences, paragraphs, and documents.

4) Though this paper does not itself form sentence/paragraph vectors, it is simple enough to do so: plug in the individual word vectors (GloVe word vectors are found to give the best performance) and combine them into a vector representation of the whole sentence/paragraph.

5) Using a CNN to summarize documents.
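
The idea in item 4 of combining individual word vectors can be illustrated with the simplest possible composition, a plain average. This is only a sketch of that baseline, not the method of the paper mentioned there; the 3-dimensional `toy_vectors` and the helper name `average_vector` are made up for illustration, and real GloVe or word2vec vectors would be loaded from a file.

```python
import numpy as np

# Hypothetical 3-dimensional word vectors, for illustration only.
toy_vectors = {
    "cats": np.array([0.9, 0.1, 0.0]),
    "chase": np.array([0.2, 0.8, 0.1]),
    "mice": np.array([0.8, 0.2, 0.1]),
}

def average_vector(tokens, vectors, dim=3):
    """Mean of the vectors of the known tokens; zeros if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return np.zeros(dim)
    return np.mean(known, axis=0)

# "the" is not in the toy vocabulary, so it is skipped.
sentence_vec = average_vector(["cats", "chase", "the", "mice"], toy_vectors)
print(sentence_vec)
```

Averaging ignores word order, so it is usually treated as a baseline to compare the learned models above against.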

160

answered Oct 08 '22 17:10

Azrael

It all depends on:

which vector model you're using
what the purpose of the model is
your creativity in combining word vectors into a document vector

If you've generated the model using Word2Vec, you can try either:

Doc2Vec: https://radimrehurek.com/gensim/models/doc2vec.html
Wiki2Vec: https://github.com/idio/wiki2vec

Or you can do what some people do, i.e. sum the vectors of all content words in the document and divide by the number of content words, e.g. https://github.com/alvations/oque/blob/master/o.py#L13 (note: lines 17-18 are a hack to reduce noise):

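    import numpy as np

    def sent_vectorizer(sent, model):
        # Sum the vectors of every word in `sent` that the model knows.
        sent_vec = np.zeros(400)  # must match the word-vector dimensionality
        numw = 0
        for w in sent:
            try:
                sent_vec = np.add(sent_vec, model[w])
                numw += 1
            except KeyError:  # skip out-of-vocabulary words
                pass
        # Scale the sum to unit length (L2 norm) instead of dividing by numw.
        return sent_vec / np.sqrt(sent_vec.dot(sent_vec))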
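
Note that the description says to divide by the word count, while the function divides by the length of the summed vector. The sketch below, using made-up 3-dimensional vectors, suggests the two results differ only in scale, so cosine similarities come out the same. It also guards the case where no word is in the vocabulary, which would otherwise divide by zero. The names `mean_vector`, `unit_vector` and `cosine` are illustrative, not from the linked code.

```python
import numpy as np

def mean_vector(vectors):
    """Sum of the word vectors divided by the number of words."""
    return np.sum(vectors, axis=0) / len(vectors)

def unit_vector(vectors):
    """Sum of the word vectors scaled to unit L2 length."""
    total = np.sum(vectors, axis=0)
    norm = np.linalg.norm(total)
    # Return the zero vector unchanged instead of dividing by zero.
    return total / norm if norm > 0 else total

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two made-up word vectors forming one "sentence", and a vector to compare with.
words = np.array([[1.0, 2.0, 0.0], [3.0, 0.0, 4.0]])
other = np.array([1.0, 1.0, 1.0])

sim_mean = cosine(mean_vector(words), other)
sim_unit = cosine(unit_vector(words), other)
print(sim_mean, sim_unit)
```

If the sentence vectors are only ever compared by cosine similarity, either normalization works; the choice matters when the vectors feed a model that is sensitive to magnitude.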

answered Oct 08 '22 17:10

alvas

Tags: vector, nlp, word2vec
