
Convolutional neural network, how the second conv layer works on the first pooling layer

I'm reading material from the TensorFlow website:

https://www.tensorflow.org/tutorials/layers

Suppose we have 10 greyscale 28x28 pixel images.

  1. If we apply 32 5x5 convolutional filters with zero padding (so the spatial size is preserved) in the 1st conv layer, we get 10*32*28*28 data.
  2. If we apply 2x2 max pooling with stride 2 in the 1st pooling layer, we get 10*32*14*14 data.
  3. By now, each image has become a 14*14 image with 32 channels.

So, if we apply a second convolutional layer (say 64 5x5 filters, as in the link), do we apply these filters to each channel of each image and get 10*32*64*14*14 data?

John asked Oct 09 '17 06:10



2 Answers

Yes and no. Each filter is applied across every channel of every image, but the per-channel results are summed into a single feature map, so you don't get 10*32*64*14*14 output dimensions. The output has dimensionality 10*64*14*14, because the layer specifies 64 output channels per image. The weights used for this convolution have size 32*64*5*5: each of the 64 filters holds one 5-by-5 kernel for every one of the 32 input channels.
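To make the shape bookkeeping concrete, here is a minimal sketch in plain Python. The helper name `conv_shapes` is hypothetical (it is not a TensorFlow function), and it assumes stride 1, an odd kernel size, and the channels-first ordering used in this answer; real frameworks may order the weight axes differently.

```python
def conv_shapes(batch, in_channels, height, width, num_filters, kernel,
                padding="same"):
    """Return (weight_shape, output_shape) for a stride-1 2-D convolution.

    Hypothetical helper for illustration only. Shapes are channels-first,
    matching the 10*32*14*14 notation used above.
    """
    if padding == "same":
        pad = (kernel - 1) // 2  # zero padding that preserves size (odd kernel)
    elif padding == "valid":
        pad = 0                  # no padding: the output shrinks
    else:
        raise ValueError("padding must be 'same' or 'valid'")

    out_h = height + 2 * pad - kernel + 1
    out_w = width + 2 * pad - kernel + 1

    # One kernel x kernel slice per (input channel, filter) pair.
    weight_shape = (in_channels, num_filters, kernel, kernel)
    # The input-channel axis is summed away; only num_filters channels remain.
    output_shape = (batch, num_filters, out_h, out_w)
    return weight_shape, output_shape


weight_shape, output_shape = conv_shapes(10, 32, 14, 14, num_filters=64, kernel=5)
print(weight_shape)      # (32, 64, 5, 5)
print(output_shape)      # (10, 64, 14, 14)
print(32 * 64 * 5 * 5)   # 51200 weights, not counting biases
```

Note that 32 appears only in the weight shape, never in the output shape: the input channels are consumed by the sum inside each filter.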

kafman answered Oct 11 '22 13:10


No. If you convolve a 14x14x32 volume (ignoring the batch size) with a set of 64 5x5 filters, using zero padding, you'll end up with a 14x14x64 output volume.

Every single convolutional filter is convolved along the whole input depth, so each 5x5 filter is really 5x5x32. Convolving your 14x14x32 input volume with one such filter produces a single 14x14x1 feature map.

The second filter of the stack of 64 is then convolved with the same input volume, and so on for each of the 64 filters. The resulting feature maps are stacked, forming your 14x14x64 output volume.
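The procedure described above can be sketched as a naive, dependency-free loop. This is only an illustration under stated assumptions: the function names are made up, stride is 1, the kernel size is odd, and (as is conventional in deep-learning frameworks) the kernel is not flipped. The demo uses a scaled-down 6x6x3 input with 4 filters so that it runs quickly; the same code applies unchanged to a 14x14x32 volume with 64 filters.

```python
def convolve_one_filter(volume, filt):
    """Slide one filter over an H x W x C volume (zero padding, stride 1).

    volume: nested lists indexed [row][col][channel]
    filt:   nested lists indexed [k_row][k_col][channel], odd kernel size
    Returns a single H x W feature map.
    """
    height, width, depth = len(volume), len(volume[0]), len(volume[0][0])
    k = len(filt)
    pad = k // 2
    feature_map = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            total = 0.0
            for dr in range(k):
                for dc in range(k):
                    rr, cc = r + dr - pad, c + dc - pad
                    # Positions outside the input count as zeros (padding).
                    if 0 <= rr < height and 0 <= cc < width:
                        # Sum over the whole input depth: this is why the
                        # input-channel axis disappears from the output.
                        for ch in range(depth):
                            total += volume[rr][cc][ch] * filt[dr][dc][ch]
            feature_map[r][c] = total
    return feature_map


def conv_layer(volume, filters):
    """Apply every filter, then stack the feature maps as output channels."""
    maps = [convolve_one_filter(volume, f) for f in filters]
    height, width = len(volume), len(volume[0])
    return [[[m[r][c] for m in maps] for c in range(width)]
            for r in range(height)]


# Scaled-down demo: 6x6x3 input of ones, 4 filters of shape 5x5x3, all ones.
volume = [[[1.0] * 3 for _ in range(6)] for _ in range(6)]
filters = [[[[1.0] * 3 for _ in range(5)] for _ in range(5)] for _ in range(4)]

out = conv_layer(volume, filters)
print((len(out), len(out[0]), len(out[0][0])))  # (6, 6, 4)
print(out[3][3][0])  # 75.0 = 5 * 5 * 3, the full window summed over all channels
```

The output depth equals the number of filters (4 here, 64 in the question), regardless of how many channels the input has.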

nessuno answered Oct 11 '22 13:10