 

How to get word vectors from Keras Embedding Layer

I'm currently working with a Keras model which has an embedding layer as its first layer. In order to visualize the relationships and similarities between words, I need a function that returns the mapping of every word in the vocabulary to its vector (e.g. 'love' - [0.21, 0.56, ..., 0.65, 0.10]).

Is there any way to do it?

asked Jul 08 '18 by philszalay

People also ask

What is the output of the Keras Embedding layer?

The output of the Embedding layer is a 2D tensor with one embedding vector for each word in the input sequence of words (input document). If you wish to connect a Dense layer directly to an Embedding layer, you must first flatten that 2D output to a 1D vector using the Flatten layer.
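For instance, here is a minimal sketch of that pattern, assuming the tf.keras API; the vocabulary size, embedding dimension and sequence length are made up for illustration:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Flatten, Dense

# per-sample shapes: Embedding -> (5, 8), Flatten -> (40,), Dense -> (1,)
model = Sequential([
    Embedding(input_dim=1000, output_dim=8, input_length=5),
    Flatten(),
    Dense(1, activation='sigmoid'),
])
model.summary()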

How the Word2vec embedding is obtained?

The context words are first passed as input to an embedding layer (initialized with random weights). The word embeddings are then passed to a Lambda layer, where we average them. The averaged embedding is then passed to a Dense softmax layer that predicts the target word.
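A rough sketch of that CBOW-style setup follows; the vocabulary size, embedding dimension and context window below are assumptions for illustration, not fixed values:

import tensorflow as tf
from tensorflow.keras.layers import Input, Embedding, Lambda, Dense
from tensorflow.keras.models import Model

vocab_size, embed_dim, context_len = 5000, 100, 4  # e.g. 2 context words on each side

context = Input(shape=(context_len,), dtype='int32')
emb = Embedding(vocab_size, embed_dim)(context)          # (batch, 4, 100)
avg = Lambda(lambda x: tf.reduce_mean(x, axis=1))(emb)   # average the context embeddings
probs = Dense(vocab_size, activation='softmax')(avg)     # predict the target word

cbow = Model(context, probs)
cbow.compile(optimizer='adam', loss='sparse_categorical_crossentropy')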

How do I use the Keras Embedding layer?

We can create a simple Keras model by just adding an embedding layer (see the sketch below). We set the vocabulary size to 10, as we will be encoding the numbers 0 to 9. We want the length of the word vector to be 4, hence output_dim is set to 4. The length of the input sequence to the embedding layer will be 2.
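A minimal sketch of that model, again assuming the tf.keras API:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding

model = Sequential()
model.add(Embedding(input_dim=10, output_dim=4, input_length=2))  # vocab 0-9, 4-dim vectors, 2 tokens per input
model.compile(optimizer='adam', loss='mse')

# each integer in the input is replaced by its 4-dimensional vector
print(model.predict(np.array([[1, 2]])).shape)  # (1, 2, 4)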


1 Answer

You can get the word embeddings by using the get_weights() method of the embedding layer (i.e. essentially the weights of an embedding layer are the embedding vectors):

# if you have access to the embedding layer explicitly
embeddings = embedding_layer.get_weights()[0]

# or access the embedding layer through the constructed model
# the first `0` refers to the position of the embedding layer in the `model`
embeddings = model.layers[0].get_weights()[0]

# `embeddings` has a shape of (num_vocab, embedding_dim)

# `word_to_index` is a mapping (i.e. dict) from words to their index, e.g. `love`: 69
words_embeddings = {w: embeddings[idx] for w, idx in word_to_index.items()}

# now you can use it like this for example
print(words_embeddings['love'])  # possible output: [0.21, 0.56, ..., 0.65, 0.10]
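If you do not already have such a `word_to_index` mapping, one common place to get it (an assumption on my part, not part of the answer above) is the `word_index` attribute of Keras's Tokenizer:

from tensorflow.keras.preprocessing.text import Tokenizer

texts = ["i love deep learning", "keras makes embeddings easy"]  # hypothetical corpus
tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts)

# word -> integer index; index 0 is reserved for padding
word_to_index = tokenizer.word_index
words_embeddings = {w: embeddings[idx] for w, idx in word_to_index.items()}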
answered Oct 12 '22 by today