 

What is the difference between Flatten() and GlobalAveragePooling2D() in Keras?

I want to pass the output of ConvLSTM and Conv2D to a Dense layer in Keras. What is the difference between using global average pooling and flatten? Both work in my case.

model.add(ConvLSTM2D(filters=256,kernel_size=(3,3)))
model.add(Flatten())
# or model.add(GlobalAveragePooling2D())
model.add(Dense(256,activation='relu'))
asked Mar 15 '18 by user239457


3 Answers

The fact that both seem to work doesn't mean they do the same thing.

Flatten will take a tensor of any shape and transform it into a one-dimensional tensor (plus the samples dimension), keeping all values in the tensor. For example, a tensor of shape (samples, 10, 20, 1) will be flattened to (samples, 10 * 20 * 1) = (samples, 200).

GlobalAveragePooling2D does something different. It applies average pooling over the spatial dimensions, leaving one value per channel. In this case values are not kept individually, as they are averaged away. For example, a tensor of shape (samples, 10, 20, 1) would be reduced to (samples, 1), assuming the 2nd and 3rd dimensions were spatial (channels last).
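To make the contrast concrete, here is a minimal NumPy sketch of what the two layers compute on the (samples, 10, 20, 1) example above (plain array operations standing in for the Keras layers, with random placeholder data):

```python
import numpy as np

# A batch of 4 feature maps shaped (samples, height, width, channels),
# mirroring the (samples, 10, 20, 1) example above.
x = np.random.rand(4, 10, 20, 1)

# Flatten: keep every value, just reshape to (samples, 10*20*1).
flattened = x.reshape(x.shape[0], -1)
print(flattened.shape)  # (4, 200)

# GlobalAveragePooling2D: average over the spatial axes (height, width),
# leaving one value per channel -> (samples, channels).
pooled = x.mean(axis=(1, 2))
print(pooled.shape)  # (4, 1)
```

Flatten preserves all 200 values per sample; global average pooling collapses them into a single per-channel mean.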

answered Oct 16 '22 by Dr. Snoopy


What a Flatten layer does

After convolutional operations, tf.keras.layers.Flatten will reshape a tensor into (n_samples, height*width*channels), for example turning (16, 28, 28, 3) into (16, 2352). Let's try it:

import tensorflow as tf

x = tf.random.uniform(shape=(100, 28, 28, 3), minval=0, maxval=256, dtype=tf.int32)

flat = tf.keras.layers.Flatten()

flat(x).shape
TensorShape([100, 2352])

What a GlobalAveragePooling layer does

After convolutional operations, tf.keras.layers.GlobalAveragePooling2D averages all the values over the spatial axes, leaving one value per channel (the last axis). This means that the resulting shape will be (n_samples, n_channels). For instance, if your last convolutional layer had 64 filters, it would turn (16, 7, 7, 64) into (16, 64). Let's test it after a few convolutional operations:

import tensorflow as tf

x = tf.cast(
    tf.random.uniform(shape=(16, 28, 28, 3), minval=0, maxval=256, dtype=tf.int32),
    tf.float32)


gap = tf.keras.layers.GlobalAveragePooling2D()

for i in range(5):
    conv = tf.keras.layers.Conv2D(64, 3)
    x = conv(x)
    print(x.shape)

print(gap(x).shape)
(16, 24, 24, 64)
(16, 22, 22, 64)
(16, 20, 20, 64)
(16, 18, 18, 64)
(16, 16, 16, 64)

(16, 64)
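The same reduction can be checked without Keras: global average pooling is just a mean over the spatial axes. A NumPy sketch on a tensor of the same final shape as the Conv2D stack above (random placeholder data):

```python
import numpy as np

# Same shape as the final conv output above: (16, 16, 16, 64).
x = np.random.rand(16, 16, 16, 64)

# GlobalAveragePooling2D is equivalent to a mean over axes 1 and 2:
pooled = x.mean(axis=(1, 2))
print(pooled.shape)  # (16, 64)

# Each output element is the average of one whole 16x16 feature map:
assert np.allclose(pooled[0, 0], x[0, :, :, 0].mean())
```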

Which should you use?

A Dense layer placed after Flatten will always have at least as many parameters as one placed after GlobalAveragePooling2D. If the final tensor shape before flattening is still large, for instance (16, 240, 240, 128), Flatten produces an insane number of features: 240 * 240 * 128 = 7,372,800. That number gets multiplied by the number of units in your next Dense layer! In that situation, GlobalAveragePooling2D is preferable in most cases. If you used MaxPooling2D and Conv2D so much that your tensor shape before flattening is tiny, like (16, 1, 1, 128), it won't make a difference. If you're overfitting, you might want to try GlobalAveragePooling2D.
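The parameter-count arithmetic can be worked out directly. Assuming the hypothetical (16, 240, 240, 128) conv output from the text and a following Dense layer with 256 units (a number chosen only for illustration):

```python
units = 256       # hypothetical Dense layer size
h, w, c = 240, 240, 128  # final conv output: (samples, 240, 240, 128)

# Dense after Flatten: one weight per (input feature, unit), plus biases.
flatten_params = h * w * c * units + units
print(flatten_params)  # 1,887,437,056

# Dense after GlobalAveragePooling2D: only `c` inputs per unit.
gap_params = c * units + units
print(gap_params)  # 33,024
```

Nearly two billion weights versus thirty-three thousand, for the same number of output units.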

answered Oct 16 '22 by Nicolas Gervais


Flattening is a no-brainer: it simply converts a multi-dimensional tensor to a one-dimensional one by rearranging the elements.

GlobalAveragePooling, by contrast, is a way to get a more compact representation of your feature maps, and it comes in 1D/2D/3D variants. Unlike ordinary pooling layers, it does not slide a window across the input and needs no padding: it collapses each spatial dimension entirely, either by averaging the values (GlobalAveragePooling) or by picking the max value (GlobalMaxPooling).

Both are simple ways to collapse the spatial structure into a vector before a Dense layer.
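The average/max distinction mentioned above can be sketched in NumPy (array operations standing in for GlobalAveragePooling2D and GlobalMaxPooling2D, with random placeholder data):

```python
import numpy as np

x = np.random.rand(2, 4, 4, 3)  # (samples, height, width, channels)

avg = x.mean(axis=(1, 2))  # what GlobalAveragePooling2D computes
mx = x.max(axis=(1, 2))    # what GlobalMaxPooling2D computes
print(avg.shape, mx.shape)  # (2, 3) (2, 3)
```

Both reduce each feature map to one number per channel; they differ only in whether that number is the mean or the maximum.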

answered Oct 16 '22 by The Free Soul