Supposedly, ELMo is a word embedding, so if the input is a sentence (a sequence of words), the output should be a sequence of vectors, one per word. Apparently, this is not the case.
The code below uses keras and tensorflow_hub.
import numpy as np
from keras import layers
from keras.models import Model

a = ['aaa bbbb cccc uuuu vvvv wrwr', 'ddd ee fffff ppppp']
a = np.array(a, dtype=object)[:, np.newaxis]
# a.shape == (2, 1)
input_text = layers.Input(shape=(1,), dtype="string")
embedding = ElmoEmbeddingLayer()(input_text)
model = Model(inputs=[input_text], outputs=embedding)
model.summary()
The class ElmoEmbeddingLayer is from https://github.com/strongio/keras-elmo/blob/master/Elmo%20Keras.ipynb.
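For context, the layer in that notebook wraps the tfhub ELMo module roughly as follows (a simplified paraphrase; the trainable-weight bookkeeping and masking from the notebook are omitted here):

import tensorflow as tf
import tensorflow_hub as hub
from keras import backend as K
from keras.layers import Layer

class ElmoEmbeddingLayer(Layer):
    """Keras wrapper around the tfhub ELMo module (simplified sketch)."""
    def __init__(self, **kwargs):
        self.dimensions = 1024
        super(ElmoEmbeddingLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        self.elmo = hub.Module('https://tfhub.dev/google/elmo/2',
                               trainable=False, name='elmo_module')
        super(ElmoEmbeddingLayer, self).build(input_shape)

    def call(self, x, mask=None):
        # The module returns a dict of tensors; this layer keeps only
        # the 'default' entry, which has shape (batch_size, 1024).
        return self.elmo(K.squeeze(K.cast(x, tf.string), axis=1),
                         as_dict=True, signature='default')['default']

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.dimensions)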
b = model.predict(a)
#b.shape == (2, 1024)
Apparently, the embedding assigns a single 1024-dimensional vector to each sentence rather than one vector per word. This is confusing.
Thank you.
I think I've found the answer. It's in https://tfhub.dev/google/elmo/2.
The output dictionary contains:
word_emb: the character-based word representations with shape [batch_size, max_length, 512].
lstm_outputs1: the first LSTM hidden state with shape [batch_size, max_length, 1024].
lstm_outputs2: the second LSTM hidden state with shape [batch_size, max_length, 1024].
elmo: the weighted sum of the 3 layers, where the weights are trainable. This tensor has shape [batch_size, max_length, 1024].
default: a fixed mean-pooling of all contextualized word representations with shape [batch_size, 1024].
The 4th entry, elmo, is the actual per-word embedding. The 5th, default, reduces the sequence produced by elmo to a single vector by mean-pooling, effectively turning the whole thing into a sentence embedding; that is what the model above is returning.
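To get one vector per word rather than per sentence, one option is to call the hub module directly and read the elmo key instead of default. A minimal sketch, assuming TensorFlow 1.x graph mode and the same elmo/2 module:

import tensorflow as tf
import tensorflow_hub as hub

sentences = ["aaa bbbb cccc uuuu vvvv wrwr", "ddd ee fffff ppppp"]

elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=False)
outputs = elmo(sentences, signature="default", as_dict=True)

word_vectors = outputs["elmo"]         # [batch_size, max_length, 1024], one vector per word
sentence_vectors = outputs["default"]  # [batch_size, 1024], mean-pooled over the words

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    w, s = sess.run([word_vectors, sentence_vectors])

print(w.shape)  # (2, 6, 1024) -- 6 tokens in the longer sentence
print(s.shape)  # (2, 1024)

Equivalently, in the Keras layer sketched above, switching the selected key from 'default' to 'elmo' (and adjusting compute_output_shape to the 3-D shape) should give per-word output.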