
What does GlobalAveragePooling1D do in keras?

In the embedding example here: https://www.tensorflow.org/text/guide/word_embeddings

result = embedding_layer(tf.constant([[0, 1, 2], [3, 4, 5]]))
result.shape
TensorShape([2, 3, 5])

Then it explains:

When given a batch of sequences as input, an embedding layer returns a 3D floating point tensor, of shape (samples, sequence_length, embedding_dimensionality). To convert from this sequence of variable length to a fixed representation there are a variety of standard approaches. You could use an RNN, Attention, or pooling layer before passing it to a Dense layer. This tutorial uses pooling because it's the simplest.

The GlobalAveragePooling1D layer returns a fixed-length output vector for each example by averaging over the sequence dimension. This allows the model to handle input of variable length, in the simplest way possible.

Then the code:

embedding_dim=16

model = Sequential([
  vectorize_layer,
  Embedding(vocab_size, embedding_dim, name="embedding"),
  GlobalAveragePooling1D(),
  Dense(16, activation='relu'),
  Dense(1)
])

The GlobalAveragePooling1D should calculate a single integer for each word's embedding of dimension = n. I don't understand this part:

This allows the model to handle input of variable length, in the simplest way possible.

Similarly:

To convert from this sequence of variable length to a fixed representation there are a variety of standard approaches.

In each embedding layer, the input length is already fixed by the parameter 'input_length', and truncation and padding are used to ensure a fixed input length. So what does it mean to say that GlobalAveragePooling1D is used to convert from this sequence of variable length to a fixed representation? What does 'variable length' mean here?

marlon asked May 26 '26 02:05


1 Answer

I'm studying ML myself, so it's just my take on understanding GlobalAveragePooling1D.

The key to understanding this example is the quote slightly above the passage you were quoting:

It can embed sequences of variable lengths. You could feed into the embedding layer above batches with shapes (32, 10) (batch of 32 sequences of length 10) or (64, 15) (batch of 64 sequences of length 15).

So within a single batch all sequences will have the same length. But you can then take the same model and feed it batches with other sequence lengths. Thanks to GlobalAveragePooling1D, the vectors that reach the Dense layer will always have the same length (in fact, it will be equal to the embedding dimension), because the layer averages over the sequence dimension.
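To make that concrete, here is a minimal NumPy sketch of what I believe happens. It is not the Keras implementation: the random `embedding_table` stands in for the Embedding layer's weights, the sizes 1000 and 5 are assumptions chosen for illustration, and the helper names `embed` and `global_average_pool_1d` are hypothetical. It also ignores masking.

```python
import numpy as np

# Assumed sizes, for illustration only.
VOCAB_SIZE, EMBEDDING_DIM = 1000, 5

rng = np.random.default_rng(0)
# Stand-in for the Embedding layer's weights: one row per token id.
embedding_table = rng.normal(size=(VOCAB_SIZE, EMBEDDING_DIM))


def embed(token_ids):
    """Look up one vector per token: (batch, seq_len) -> (batch, seq_len, dim)."""
    return embedding_table[token_ids]


def global_average_pool_1d(x):
    """Average over the sequence axis: (batch, seq_len, dim) -> (batch, dim)."""
    return x.mean(axis=1)


# Two batches with different sequence lengths, as in the quoted passage.
batch_a = rng.integers(0, VOCAB_SIZE, size=(32, 10))
batch_b = rng.integers(0, VOCAB_SIZE, size=(64, 15))

pooled_a = global_average_pool_1d(embed(batch_a))
pooled_b = global_average_pool_1d(embed(batch_b))

print(embed(batch_a).shape, "->", pooled_a.shape)  # (32, 10, 5) -> (32, 5)
print(embed(batch_b).shape, "->", pooled_b.shape)  # (64, 15, 5) -> (64, 5)
```

Whatever the sequence length, the last axis after pooling equals the embedding dimension, so a Dense layer placed after it always sees inputs of the same width.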

This is how I understand the flow:

[Image: word preprocessing flow]

Roman Pashkovsky answered May 31 '26 12:05


