Is ELMo a word embedding or a sentence embedding?

Supposedly, ELMo is a word embedding, so if the input is a sentence (a sequence of words), the output should be a sequence of vectors, one per word. Apparently, this is not the case.

The code below uses Keras and tensorflow_hub.

import numpy as np
from keras import layers
from keras.models import Model

# Two toy "sentences"; the Elmo layer expects a (batch_size, 1) array of strings.
a = ['aaa bbbb cccc uuuu vvvv wrwr', 'ddd ee fffff ppppp']
a = np.array(a, dtype=object)[:, np.newaxis]
# a.shape == (2, 1)

input_text = layers.Input(shape=(1,), dtype="string")
embedding = ElmoEmbeddingLayer()(input_text)  # defined in the notebook linked below
model = Model(inputs=[input_text], outputs=embedding)

model.summary()

The class ElmoEmbeddingLayer is from https://github.com/strongio/keras-elmo/blob/master/Elmo%20Keras.ipynb.

b = model.predict(a)
# b.shape == (2, 1024)

Apparently, the embedding assigns a single 1024-dimensional vector to each sentence rather than one vector per word. This is confusing.
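
For reference, the forward pass of ElmoEmbeddingLayer in that notebook looks roughly like this (paraphrased, so details may differ slightly from the original):

def call(self, x, mask=None):
    # Squeeze the (batch_size, 1) string tensor to (batch_size,), run the hub
    # module, and take the 'default' entry of its output dictionary.
    result = self.elmo(K.squeeze(K.cast(x, tf.string), axis=1),
                       as_dict=True,
                       signature='default')['default']
    return result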

Thank you.


1 Answer

I think I've found the answer: it's in the module documentation at https://tfhub.dev/google/elmo/2.

The output dictionary contains:

  1. word_emb: the character-based word representations with shape [batch_size, max_length, 512].

  2. lstm_outputs1: the first LSTM hidden state with shape [batch_size, max_length, 1024].

  3. lstm_outputs2: the second LSTM hidden state with shape [batch_size, max_length, 1024].

  4. elmo: the weighted sum of the 3 layers, where the weights are trainable. This tensor has shape [batch_size, max_length, 1024].

  5. default: a fixed mean-pooling of all contextualized word representations with shape [batch_size, 1024].

The 4th output (elmo) is the actual word embedding: one 1024-dimensional vector per token. The 5th (default) mean-pools that sequence into a single vector, effectively turning the whole thing into a sentence embedding. The ElmoEmbeddingLayer in the notebook indexes the output dictionary with ['default'] (see the sketch in the question), which is why model.predict returns shape (2, 1024): one vector per sentence.
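
To get one vector per word instead, select the elmo output rather than default. A minimal sketch using the TF1-style hub.Module API that this module targets (the sentences and variable names here are just illustrative):

import tensorflow as tf
import tensorflow_hub as hub

# Load the module (TF1-style; elmo/2 is not a TF2 SavedModel).
elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=False)

sentences = ["the cat sat on the mat", "dogs are great"]
outputs = elmo(sentences, signature="default", as_dict=True)

word_emb = outputs["elmo"]     # [batch_size, max_length, 1024], one vector per word
sent_emb = outputs["default"]  # [batch_size, 1024], mean-pooled over words

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    w, s = sess.run([word_emb, sent_emb])
    print(w.shape)  # (2, 6, 1024) -- max_length is the longest sentence
    print(s.shape)  # (2, 1024)

In the Keras layer from the notebook, the corresponding change would be roughly to index ['elmo'] instead of ['default'] in call and to adjust compute_output_shape to return (batch_size, max_length, 1024).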
