Does Word2Vec have a hidden layer?

Tags:

word2vec

I am reading one of Tomas Mikolov's papers: http://arxiv.org/pdf/1301.3781.pdf

I have a question about this passage from the Continuous Bag-of-Words Model section:

The first proposed architecture is similar to the feedforward NNLM, where the non-linear hidden layer is removed and the projection layer is shared for all words (not just the projection matrix); thus, all words get projected into the same position (their vectors are averaged).

Some people say that the Word2Vec model has a hidden layer, but as I understand it, the model has only a projection layer. Does this projection layer do the same work as a hidden layer?

My other question is: how is the input data projected into the projection layer?

Also, what does "the projection layer is shared for all words (not just the projection matrix)" mean?

431

asked Oct 27 '15 16:10

Kun

1 Answer

From the original paper, section 3.1, it is clear that there is no non-linear hidden layer; only the linear projection layer remains:

"the first proposed architecture is similar to the feedforward NNLM where the non-linear hidden layer is removed and the projection layer is shared for all words".

With respect to your second question (what sharing the projection layer means): the model uses only a single vector, which is the centroid (the average) of the vectors of all the words in the context. Thus, instead of having n-1 word vectors as input, you have only one vector. This is why it is called Continuous Bag of Words: word order is lost within the context of size n-1.
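To make the projection and averaging steps concrete, here is a minimal NumPy sketch of a CBOW forward pass. It is only an illustration, not the reference word2vec implementation: the toy sizes, the names `W_in`, `W_out`, `one_hot` and `cbow_forward`, and the use of a plain full softmax at the output (the paper relies on hierarchical softmax) are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, chosen only for illustration (assumption).
vocab_size, dim = 10, 4
W_in = rng.normal(size=(vocab_size, dim))   # projection matrix: one row per word
W_out = rng.normal(size=(dim, vocab_size))  # output weights (hypothetical name)


def one_hot(index, size):
    """Return a one-hot vector with a 1.0 at position `index`."""
    v = np.zeros(size)
    v[index] = 1.0
    return v


def cbow_forward(context_ids):
    """Project the context words, average them, and score every vocabulary word."""
    # Projecting a one-hot input through W_in is just a row lookup.
    projected = W_in[context_ids]           # shape: (number of context words, dim)
    # Shared projection layer: all context words collapse into one averaged vector.
    # No non-linearity is applied here.
    h = projected.mean(axis=0)              # shape: (dim,)
    # Full softmax over the vocabulary (a simplification, see the note above).
    scores = h @ W_out
    exp = np.exp(scores - scores.max())
    return h, exp / exp.sum()


context = [1, 3, 5, 7]
h, probs = cbow_forward(context)

# A one-hot vector times the projection matrix selects the matching row.
assert np.allclose(one_hot(3, vocab_size) @ W_in, W_in[3])

# Reordering the context does not change the averaged vector ("bag of words").
h_shuffled, _ = cbow_forward([7, 5, 3, 1])
assert np.allclose(h, h_shuffled)
```

The last check shows why word order is lost: the average is the same whatever order the context words arrive in. The only step between input and output is a linear average, which is why some people still loosely call the projection layer a (linear) hidden layer.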

137

answered Oct 09 '22 02:10

Antoine

