I'm trying to understand model.summary() in Keras. I have the following convolutional neural network. The summary line for the first convolution layer is:
conv2d_4 (Conv2D) (None, 148, 148, 16) 448
Where do 148 and 448 come from?
Code
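# Imports are assumed here; the original snippet did not show them
from tensorflow.keras import layers
from tensorflow.keras.models import Model

image_input = layers.Input(shape=(150, 150, 3))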
x = layers.Conv2D(16, 3, activation='relu')(image_input)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(32, 3, activation='relu')(x)
x = layers.MaxPooling2D(2)(x)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.MaxPooling2D(2)(x)
x = layers.Flatten()(x)
x = layers.Dense(512, activation='relu')(x)
output = layers.Dense(1, activation='sigmoid')(x)
# Keras Model definition
# input = input feature map
# output = input feature map + stacked convolution/maxpooling layers + fully connected layer + sigmoid output layer
model = Model(image_input, output)
model.summary()
Output
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         (None, 150, 150, 3)       0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 148, 148, 16)      448
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 74, 74, 16)        0
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 72, 72, 32)        4640
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 36, 36, 32)        0
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 34, 34, 64)        18496
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 17, 17, 64)        0
_________________________________________________________________
flatten_1 (Flatten)          (None, 18496)             0
_________________________________________________________________
dense_1 (Dense)              (None, 512)               9470464
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 513
From the Keras documentation, you can see that Conv2D defaults to padding='valid' (no padding) and strides of 1. With a 3 x 3 kernel and no padding, each spatial dimension shrinks by 2, so a 150 x 150 input gives a 148 x 148 output.
To calculate this you could use this formula:
O = (W - K + 2P)/S + 1
where O is the output height/width, W is the input height/width, K is the filter size, P is the padding and S is the stride size. Here W = 150, K = 3, P = 0 and S = 1, so O = (150 - 3 + 0)/1 + 1 = 148.
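As a quick check, the formula can be applied layer by layer to reproduce the spatial sizes in the summary. This is only a sketch: conv_output_size is a hypothetical helper, not part of Keras, and it assumes 'valid' padding (P = 0), rounding down when the division is not exact, and that MaxPooling2D(2) behaves like a window of size 2 with stride 2.

```python
def conv_output_size(w, k, p=0, s=1):
    """Output height/width from O = (W - K + 2P) / S + 1, rounded down."""
    return (w - k + 2 * p) // s + 1


size = 150           # input height/width
sizes = [size]
for _ in range(3):   # three Conv2D + MaxPooling2D pairs, as in the question
    size = conv_output_size(size, k=3)        # Conv2D(..., 3): K=3, P=0, S=1
    sizes.append(size)
    size = conv_output_size(size, k=2, s=2)   # MaxPooling2D(2): K=2, P=0, S=2
    sizes.append(size)

print(sizes)  # [150, 148, 74, 72, 36, 34, 17]
```

The final 17 x 17 feature map with 64 channels flattens to 17 x 17 x 64 = 18496 values, which matches the flatten_1 row.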
As for the parameter count: the layer has 16 filters, and each filter has a 3 x 3 kernel that spans all 3 input color channels, so each filter has 3 x 3 x 3 = 27 weights. That gives 16 x 27 = 432 weights, plus one bias per filter (16 biases), for a total of 448. Hope this helps!
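The same counting rule reproduces the other Param # values in the summary. The sketch below uses two hypothetical helpers (conv2d_params and dense_params are not Keras functions) and assumes square kernels and one bias per filter or unit, which is the Keras default.

```python
def conv2d_params(filters, kernel_size, in_channels):
    """Weights (kernel_size * kernel_size * in_channels per filter) plus one bias per filter."""
    return filters * (kernel_size * kernel_size * in_channels + 1)


def dense_params(units, in_features):
    """One weight per input feature plus one bias, for each unit."""
    return units * (in_features + 1)


print(conv2d_params(16, 3, 3))    # conv2d_4: 448
print(conv2d_params(32, 3, 16))   # conv2d_5: 4640
print(conv2d_params(64, 3, 32))   # conv2d_6: 18496
print(dense_params(512, 18496))   # dense_1: 9470464
print(dense_params(1, 512))       # dense_2: 513
```

The pooling and flatten layers have no trainable weights, which is why their Param # is 0.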