While going through the autoencoder tutorial on the Keras blog, I saw that the author uses 'same' padding in the max pooling layers of the Convolutional Autoencoder part, as shown below.
x = MaxPooling2D((2, 2), padding='same')(x)
Could someone explain the reason behind this? With max pooling we want to reduce the height and width, so why use 'same' padding, which keeps the height and width the same?
In addition, this code does halve the spatial dimensions, so the 'same' padding doesn't seem to be doing what its name suggests.
The padding type is called SAME because, when stride=1, the output size is the same as the input size. Using 'SAME' ensures that the filter is applied to all the elements of the input, including those at the edges, which is why padding is normally set to "SAME" while training a model: the output size stays mathematically convenient for further computation.
With the VALID option there is no padding at all: the pooling window is placed only at positions where it fits entirely inside the input, starting at (0, 0) and selecting the maximum value from each overlapping region, so any leftover rows or columns at the edges are simply dropped.
"same" results in padding with zeros evenly to the left/right or up/down of the input. When padding="same" and strides=1 , the output has the same size as the input.
Same Padding: In this case, we add 'p' padding layers such that the output image has the same dimensions as the input image.
From https://keras.io/layers/convolutional/
"same" results in padding the input such that the output has the same length as the original input.
From https://keras.io/layers/pooling/
pool_size: integer or tuple of 2 integers, factors by which to downscale (vertical, horizontal). (2, 2) will halve the input in both spatial dimensions. If only one integer is specified, the same window length will be used for both dimensions.
So, first let's ask: why use padding at all? In the convolutional kernel context it matters because we want every pixel, including those at the edges and corners, to sit at the "center" of the kernel at some point; there could be important features there that the kernel would otherwise miss. So we pad around the edges for Conv2D, and as a result it returns an output the same size as the input.
In the case of the MaxPooling2D layer we pad for a similar reason, but here the stride defaults to the pooling size. Since your pooling size is 2, your image will be halved each time it goes through a pooling layer, with 'same' padding rounding odd dimensions up rather than truncating them.
input_img = Input(shape=(28, 28, 1)) # adapt this if using `channels_first` image data format
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
# at this point the representation is (4, 4, 8) i.e. 128-dimensional
So in the case of your tutorial example, your image dimensions go 28 -> 14 -> 7 -> 4, with each arrow representing a pooling layer. Note that the last step, 7 -> 4, only works out because 'same' padding rounds up (ceil(7/2) = 4); with 'valid' padding you would get floor(7/2) = 3 and lose the bottom row and right column.
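That 28 -> 14 -> 7 -> 4 shrinkage can be reproduced by repeatedly applying the SAME-padding size rule ceil(n / 2). A minimal sketch (the helper name `trace_pooling` is hypothetical, not part of Keras):

```python
import math

def trace_pooling(size, num_layers, pool=2):
    """Trace one spatial dimension through repeated MaxPooling2D layers
    with padding='same' and strides defaulting to pool_size,
    i.e. size -> ceil(size / pool) at each layer."""
    sizes = [size]
    for _ in range(num_layers):
        sizes.append(math.ceil(sizes[-1] / pool))
    return sizes

print(trace_pooling(28, 3))  # [28, 14, 7, 4]
```

The final 4x4 spatial grid with 8 channels is exactly the 4 * 4 * 8 = 128-dimensional encoded representation mentioned in the tutorial's comment.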