 

Why use same padding with max pooling?

Tags:

padding

keras

While going through the autoencoder tutorial on the Keras blog, I saw that the author uses 'same' padding in the max pooling layers of the Convolutional Autoencoder section, as shown below.

x = MaxPooling2D((2, 2), padding='same')(x)

Could someone explain the reason behind this? With max pooling we want to reduce the height and width, so why use 'same' padding, which keeps the height and width unchanged?

In addition, this code does halve the spatial dimensions, so the 'same' padding doesn't seem to be doing what its name suggests.

asked Jan 22 '19 by cbncs


1 Answer

From https://keras.io/layers/convolutional/

"same" results in padding the input such that the output has the same length as the original input.

From https://keras.io/layers/pooling/

pool_size: integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimension. If only one integer is specified, the same window length will be used for both dimensions.

So, first let's ask: why use padding at all? In the convolutional context it matters because we want every pixel, including those at the edges and corners, to sit at the "center" of the kernel at some point; a feature the kernel is looking for may appear right at the border of the image. So we pad around the edges for Conv2D, and with stride 1 the output comes back the same size as the input.
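The size arithmetic behind this can be sketched in plain Python, with no Keras needed. These are the standard output-length formulas that TensorFlow/Keras applies per spatial dimension for 'valid' and 'same' padding:

```python
import math

def out_size(n, kernel, stride, padding):
    """Output length along one spatial dimension, using the
    standard TensorFlow/Keras formulas for 'valid' and 'same'."""
    if padding == 'valid':
        # no padding: the kernel must fit entirely inside the input
        return math.floor((n - kernel) / stride) + 1
    elif padding == 'same':
        # pad just enough that every input position is covered
        return math.ceil(n / stride)
    raise ValueError(f"unknown padding: {padding}")

# A 3x3 Conv2D with stride 1 on a 28-pixel-wide input:
print(out_size(28, kernel=3, stride=1, padding='valid'))  # 26 -- edges shrink
print(out_size(28, kernel=3, stride=1, padding='same'))   # 28 -- size preserved
```

With stride 1, 'same' padding is exactly what keeps the 28x28 image at 28x28 through each Conv2D layer; with stride > 1 (as in pooling, below) the same formula gives ceil(n / stride) instead.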

However, in the case of the MaxPooling2D layer we pad for a similar reason, but the output size is governed by the stride, which in Keras defaults to the pool size. 'same' padding does not keep the size fixed here; it guarantees the output is ceil(input / stride), i.e. that no edge pixels are discarded when the input dimension isn't divisible by the stride. Since your pool size is 2, the image is (roughly) halved at each pooling layer.

from keras.layers import Input, Conv2D, MaxPooling2D

input_img = Input(shape=(28, 28, 1))  # adapt this if using `channels_first` image data format

x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# at this point the representation is (4, 4, 8) i.e. 128-dimensional

So in the tutorial example, the image dimensions go 28 -> 14 -> 7 -> 4, with each arrow representing a pooling layer. The last step is where the padding choice matters: 7 is odd, so with 'valid' padding the leftover row and column would be dropped and you would get 3, while 'same' padding pads them out and gives ceil(7/2) = 4.
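That trace can be checked with a small plain-Python sketch, using the same ceil/floor output-size formulas TensorFlow applies for pooling (recall that MaxPooling2D's stride defaults to its pool size):

```python
import math

def pool_out(n, pool=2, padding='same'):
    # stride defaults to pool_size in Keras MaxPooling2D
    if padding == 'same':
        return math.ceil(n / pool)
    return math.floor((n - pool) / pool) + 1  # 'valid'

# With padding='same': 28 -> 14 -> 7 -> 4
n = 28
for _ in range(3):
    n = pool_out(n, padding='same')
print(n)  # 4

# With padding='valid' the odd 7 loses its leftover row/column: 28 -> 14 -> 7 -> 3
m = 28
for _ in range(3):
    m = pool_out(m, padding='valid')
print(m)  # 3
```

This is why 'same' is used in the autoencoder: the decoder later upsamples by 2 three times, and 4 -> 8 -> 16 -> 32 (cropped or convolved back to 28) is much easier to reconcile than starting from 3.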

answered Sep 24 '22 by Chudbrochil