In the Densely Connected Layer section of the TensorFlow tutorial, it says the image size is 7 x 7 after it has been processed. I tried the code, and these parameters seem to work.
But I do not know how to get this 7 x 7 dimension. I understand that the max_pool_2x2 function will reduce both image dimensions by a factor of 4, so after the first pooling operation the image size is 7 x 7.
HERE'S WHAT I DO NOT UNDERSTAND:
In the 2nd conv layer there is another max_pool_2x2 function call, so I think the image size should be reduced by a factor of 4 again. But it actually is not.
Which step did I get wrong?
What is a dense layer in a neural network? A dense layer can be defined as y = W * x + b, where W is the weight matrix, b is the bias, x is the input, y is the output, and * is matrix multiplication. In Keras, we can use tf.keras.layers.Dense() to create a dense layer.
[Figure: structure of a dense layer. Here the activation function is ReLU.]
Keras Dense layer. The dense layer is the regular, fully connected neural network layer, and it is one of the most frequently used layers. It performs the following operation on the input and returns the output: output = activation(dot(input, kernel) + bias), where dot is the NumPy-style dot product of the input and its corresponding weights.
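To make the dense layer operation concrete, here is a minimal pure-Python sketch of y = W * x + b followed by ReLU. It is only an illustration, not how Keras implements it: the helper names `dense` and `relu` and the sample numbers are made up for this example, and W is stored with one row per output unit (Keras stores its kernel the other way round and computes dot(input, kernel)).

```python
def relu(values):
    """Element-wise ReLU: negative entries become 0."""
    return [max(0.0, v) for v in values]


def dense(x, W, b, activation=None):
    """Compute y = W * x + b for a single input vector x.

    W is a list of rows (one row per output unit); b has one entry per output unit.
    """
    y = [sum(w * xi for w, xi in zip(row, x)) + bias
         for row, bias in zip(W, b)]
    return activation(y) if activation else y


# Hypothetical example values: 3 inputs, 2 output units.
x = [1.0, 2.0, 3.0]
W = [[1.0, 0.0, -1.0],
     [0.5, 0.5, 0.5]]
b = [0.0, 1.0]

y = dense(x, W, b, activation=relu)
print(y)  # [0.0, 4.0]
```

The first unit computes 1 - 3 + 0 = -2, which ReLU clips to 0; the second computes 0.5 + 1 + 1.5 + 1 = 4.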
You also need to know the stride of the max pool and convolution.
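import tensorflow as tf

def conv2d(x, W):
    # Stride 1 with 'SAME' padding keeps the spatial size unchanged.
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # A 2x2 window moved 2 pixels at a time halves each spatial dimension.
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1], padding='SAME')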
Here, we can see that the convolution has a stride of 1 and the max pool has a stride of 2. One way to think about max pooling is that it takes a 2x2 box and slides it over the image, each time taking the maximum value over 4 pixels. With a stride of 2, the box moves 2 pixels at each step, so each image dimension is reduced by a factor of 2, not 4. The convolution, with stride 1 and 'SAME' padding, does not change the image size at all.
In other words, a 28x28 picture with max pool 2x2 and stride 2 will become 14x14. Another max pool 2x2 with stride 2 will reduce it to 7x7.
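The sliding-box description can be checked with a small pure-Python sketch. This is not the TensorFlow implementation: `max_pool` is a hypothetical helper that works on nested lists and applies no padding, which gives the same output size as 'SAME' here because 28 and 14 are both divisible by the stride.

```python
def max_pool(image, size=2, stride=2):
    """Slide a size x size window over a 2-D list and keep each window's maximum.

    No padding is applied, so only windows that fit entirely inside the image are used.
    """
    rows, cols = len(image), len(image[0])
    return [
        [max(image[r + dr][c + dc] for dr in range(size) for dc in range(size))
         for c in range(0, cols - size + 1, stride)]
        for r in range(0, rows - size + 1, stride)
    ]


# A made-up 28x28 "image" whose pixel values are just increasing integers.
image = [[r * 28 + c for c in range(28)] for r in range(28)]

once = max_pool(image)   # first pooling step
twice = max_pool(once)   # second pooling step

print(len(once), len(once[0]))    # 14 14
print(len(twice), len(twice[0]))  # 7 7
```

Each call halves both dimensions, so two pooling steps take 28x28 to 14x14 and then to 7x7.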
To further illustrate my point, let's take the case of max pool 2x2 with stride 1. If we don't pad the 28x28 image, it will become a 27x27 image after max pooling, because the 2x2 box only fits in 27 positions along each dimension.
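The size arithmetic for both padding modes can be sketched as a small helper. `output_size` is a hypothetical name, not a TensorFlow function; it follows the output-size rules TensorFlow documents for 'SAME' (ceil(n / stride)) and 'VALID' (no padding) along one dimension. The 5x5 convolution window in the last line is an assumed value for illustration only.

```python
import math


def output_size(n, window, stride, padding):
    """Output length along one spatial dimension for a convolution or pooling op."""
    if padding == "SAME":
        # Input is padded as needed, so only the stride matters.
        return math.ceil(n / stride)
    if padding == "VALID":
        # No padding: count the positions where the window fits completely.
        return (n - window) // stride + 1
    raise ValueError("padding must be 'SAME' or 'VALID'")


first = output_size(28, window=2, stride=2, padding="SAME")      # first max pool
second = output_size(first, window=2, stride=2, padding="SAME")  # second max pool
unpadded = output_size(28, window=2, stride=1, padding="VALID")  # stride 1, no padding
conv = output_size(28, window=5, stride=1, padding="SAME")       # assumed 5x5 conv

print(first, second, unpadded, conv)  # 14 7 27 28
```

The last value shows why the convolution layers do not shrink the image: with stride 1 and 'SAME' padding, the output size equals the input size.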
Here's an image for a more complete answer:
Take a look at Teach Yourself Deep Learning with TensorFlow and Udacity with Vincent Vanhoucke.
This is covered in the course. I am currently working through it.
The course is free, but you do have to sign up. It is a series of videos, quizzes, and coding projects, all self-paced and self-graded. I am learning a lot and enjoying it.
Here is one of the quizzes.