
How to understand the "Densely Connected Layer" section in the TensorFlow tutorial

Tags: tensorflow

In the Densely Connected Layer section of the TensorFlow tutorial, it says the image size is 7 x 7 after it has been processed. I tried the code, and these parameters seem to work.

But I do not know how this 7 x 7 dimension is obtained. I understand that:

  • the original image is 28 x 28,
  • in the 1st conv layer, the max_pool_2x2 function reduces both image dimensions by a factor of 4, so after the first pooling operation the image size is 7 x 7,
  • HERE'S WHAT I DO NOT UNDERSTAND:

    in the 2nd conv layer there is another max_pool_2x2 call, so I think the image size should be reduced by a factor of 4 again. But it actually is not.

Which step did I get wrong?

David S. asked Jan 27 '16 10:01



2 Answers

You also need to know the stride of the max pool and convolution.

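import tensorflow as tf

def conv2d(x, W):
  # Stride 1 in every dimension with 'SAME' padding: height and width are unchanged.
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  # 2x2 window moved 2 pixels at a time: height and width are each halved (rounded up).
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')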

Here we can see that the convolution has a stride of 1 and the max pool has a stride of 2. One way to look at max pooling is that it takes a 2x2 box and slides it over the image, each time taking the maximum value over the 4 pixels inside the box. With a stride of 2, the box moves 2 pixels at each step, so each dimension of the image is reduced by a factor of 2, not 4.
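To make the sliding box concrete, here is a minimal pure-Python sketch of 2x2 max pooling without padding. The helper name pool_2x2 and the 4x4 sample values are made up for illustration; this is not how TensorFlow implements tf.nn.max_pool, only the same idea on a tiny input.

```python
def pool_2x2(image, stride=2):
    """Slide a 2x2 box over a 2-D list and keep the maximum of each box.

    Hypothetical helper for illustration. No padding is applied, so the
    box only visits positions where it fits entirely inside the image.
    """
    height, width = len(image), len(image[0])
    pooled = []
    for top in range(0, height - 1, stride):
        row = []
        for left in range(0, width - 1, stride):
            # The 4 pixels currently covered by the box.
            window = [image[top][left], image[top][left + 1],
                      image[top + 1][left], image[top + 1][left + 1]]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Made-up 4x4 "image".
image = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 10, 13, 14],
    [11, 12, 15, 16],
]
pooled = pool_2x2(image)
print(pooled)  # [[4, 8], [12, 16]]
```

The 4x4 input comes out as 2x2: each side is halved, even though every output value summarizes 4 pixels.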

In other words, a 28x28 picture with max pool 2x2 and stride 2 will become 14x14. Another max pool 2x2 with stride 2 will reduce it to 7x7.
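The same arithmetic can be checked with a few lines of Python. This is only a sketch: same_output_size is a made-up helper name, and it relies on the rule that with padding='SAME' the output length along a dimension is ceil(input / stride). The layer order (conv, pool, conv, pool) follows the question.

```python
import math


def same_output_size(input_size, stride):
    """Output length along one dimension for padding='SAME': ceil(input / stride)."""
    return math.ceil(input_size / stride)


size = 28
sizes = [size]
# Height/width strides: 1 for conv2d and 2 for max_pool_2x2, as defined above.
for name, stride in [("conv1", 1), ("pool1", 2), ("conv2", 1), ("pool2", 2)]:
    size = same_output_size(size, stride)
    sizes.append(size)
    print(name, size)

print(sizes)  # [28, 28, 14, 14, 7]
```

The convolutions leave the size alone and each pooling step halves it, which is where the 7 x 7 comes from.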

To further illustrate my point, let's take the case of max pool 2x2 and stride 1. If we don't pad the image, a 28x28 image will become 27x27 after max pool, because the 2x2 box fits in only 27 positions along each dimension.
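For the unpadded case, the number of positions the box can take along one dimension is (input - window) // stride + 1. A small sketch, with valid_output_size as a made-up helper name:

```python
def valid_output_size(input_size, window, stride):
    """Output length along one dimension with no padding (padding='VALID')."""
    return (input_size - window) // stride + 1


print(valid_output_size(28, window=2, stride=1))  # 27
print(valid_output_size(28, window=2, stride=2))  # 14
```

So it is the stride, not the 2x2 window size, that decides how much the image shrinks.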

Here's an image for a more complete answer: [image]

jkschin answered Oct 16 '22 22:10


Take a look at "Teach Yourself Deep Learning with TensorFlow and Udacity" with Vincent Vanhoucke.

This is covered in the course. I am currently working through it.

The course is free, but you do have to sign up. It is a series of videos, quizzes, and coding projects, all self-paced and self-graded. I am learning a lot and enjoying it.

Here is one of the quizzes: [image]

Guy Coder answered Oct 16 '22 21:10