I'm reading material from the TensorFlow website (https://www.tensorflow.org/tutorials/layers). Suppose we have 10 greyscale 28x28 pixel images. <ol> <li>If we apply 32 5x5 convolutional filters with zero padding in the 1st conv layer, we get 10*32*28*28 data.</li> <li>If we apply 2x2 max pooling with stride 2 in the 1st pooling layer, we get 10*32*14*14 data.</li> <li>By now, each image has become a 14*14 image with 32 channels.</li> </ol> So, if we apply a second convolutional layer (let's say 64 5x5 filters, as in the link), do we apply these filters to each channel of each image and get 10*32*64*14*14 data?

Convolutional neural network, how the second conv layer works on the first pooling layer

2 Answers

Yes and no. You do apply the filters to every channel of every image, but you don't get 10*32*64*14*14 output dimensions. The output has shape 10*64*14*14, because the layer specifies 64 output channels per image: the 32 per-channel responses of each filter are summed into a single feature map, so the input channel dimension does not survive. The weights used for this convolution have shape 32*64*5*5 (one 5-by-5 slice for every pair of input and output channels, i.e. 64 filters that each span all 32 input channels).
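The shape bookkeeping can be traced with a few lines of plain Python. This is only a sketch: the helper names `conv_same_shapes` and `max_pool_shape` are made up for illustration and are not TensorFlow functions, and it assumes stride-1 convolutions with zero ("same") padding and the batch * channels * height * width layout used in the question.

```python
def conv_same_shapes(batch, in_channels, height, width, out_channels, kernel):
    """Shapes for a stride-1 convolution with zero ("same") padding.

    Hypothetical helper; layout is batch * channels * height * width.
    """
    # Spatial size is preserved; the channel count becomes out_channels.
    output_shape = (batch, out_channels, height, width)
    # One kernel x kernel slice per (input channel, output channel) pair.
    weight_shape = (in_channels, out_channels, kernel, kernel)
    return output_shape, weight_shape


def max_pool_shape(shape, pool=2, stride=2):
    """Output shape of max pooling without padding (hypothetical helper)."""
    batch, channels, height, width = shape
    pooled_h = (height - pool) // stride + 1
    pooled_w = (width - pool) // stride + 1
    return (batch, channels, pooled_h, pooled_w)


# 10 greyscale 28x28 images -> conv1 (32 filters) -> pool1 -> conv2 (64 filters)
conv1_out, conv1_w = conv_same_shapes(10, 1, 28, 28, out_channels=32, kernel=5)
pool1_out = max_pool_shape(conv1_out)
conv2_out, conv2_w = conv_same_shapes(*pool1_out, out_channels=64, kernel=5)

print("conv1 output:", conv1_out)
print("pool1 output:", pool1_out)
print("conv2 output:", conv2_out)   # (10, 64, 14, 14), not 10*32*64*14*14
print("conv2 weights:", conv2_w)    # (32, 64, 5, 5)
```

The number of output channels depends only on how many filters the layer has; the 32 input channels show up in the weight shape, not in the output shape.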


answered Oct 11 '22 13:10

kafman

No. If you convolve (with padding, and ignoring the batch size) a 14x14x32 volume with a set of 64 5x5 filters, you'll end up with a 14x14x64 output volume.

Every single convolutional filter is convolved along the whole input depth, so each 5x5 filter is really a 5x5x32 volume of weights. Thus, your 14x14x32 input volume is convolved with the first filter and the output is a 14x14x1 feature map.

Then the second filter of the stack of 64 is convolved with the same input volume. The same operation is done for each of the 64 filters, and the resulting feature maps are stacked, forming your 14x14x64 output volume.
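To make the "one filter, whole depth" point concrete, here is a minimal pure-Python sketch of the procedure. It is not how TensorFlow implements convolution; the function names are invented for illustration, the sizes are shrunk (a 6x6x4 volume and 5 filters of 3x3x4 stand in for 14x14x32 and 64 filters of 5x5x32), and like most deep learning libraries it computes a cross-correlation with zero padding.

```python
def conv_one_filter(volume, kernel):
    """Slide one k x k x C filter over an H x W x C volume (zero padding, stride 1).

    Returns a single H x W feature map: products are summed over the
    k x k window AND over all C input channels.
    """
    height, width, channels = len(volume), len(volume[0]), len(volume[0][0])
    k = len(kernel)
    pad = k // 2
    feature_map = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - pad, x + dx - pad
                    # Positions outside the volume count as zeros (padding).
                    if 0 <= yy < height and 0 <= xx < width:
                        for c in range(channels):
                            total += volume[yy][xx][c] * kernel[dy][dx][c]
            feature_map[y][x] = total
    return feature_map


def conv_layer(volume, filters):
    """Apply every filter to the whole volume and stack the maps along depth."""
    maps = [conv_one_filter(volume, f) for f in filters]
    height, width = len(volume), len(volume[0])
    return [[[m[y][x] for m in maps] for x in range(width)] for y in range(height)]


# Small stand-ins for the sizes in the text (assumed values, chosen for speed).
H, W, C, K, NUM_FILTERS = 6, 6, 4, 3, 5
volume = [[[1.0] * C for _ in range(W)] for _ in range(H)]
filters = [[[[1.0] * C for _ in range(K)] for _ in range(K)]
           for _ in range(NUM_FILTERS)]

output = conv_layer(volume, filters)
out_shape = (len(output), len(output[0]), len(output[0][0]))
print(out_shape)  # (6, 6, 5): spatial size kept, depth = number of filters
```

With all-ones inputs and weights, an interior output value is 3 * 3 * 4 = 36, which shows that the sum runs over every input channel as well as over the spatial window.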

answered Oct 11 '22 13:10

nessuno
