Dimensions in convolutional neural network

Question

I am trying to understand how the dimensions in a convolutional neural network behave. In the figure below, the input is a 28-by-28 matrix with 1 channel. Then there are 32 5-by-5 filters (with stride 2 in height and width), so I understand that the result is 14-by-14-by-32. But the next convolutional layer has 64 5-by-5 filters (again with stride 2). Why is the result 7-by-7-by-64 and not 7-by-7-by-32*64? Aren't we applying each of the 64 filters to each of the 32 channels?

[Figure not included]

Thomas Pinetz · Accepted Answer

Each filter sums over all channels of the previous layer. A 5x5 filter in the second layer therefore spans all 32 input channels, so each output value is a weighted sum of 32*5*5 = 800 input values. The weights are shared across spatial positions, but each input channel has its own 5x5 set of weights. There are 64 such filters, each producing one output channel, which is why the depth is 64 rather than 32*64. A better explanation with images can be found here: http://cs231n.github.io/convolutional-networks/.
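To make the weighted sum concrete, here is a minimal sketch in plain Python for a single output position of the second layer. The names (`patch`, `filters`, `apply_filter`) and the random values are illustrative assumptions, and the bias term is left out for brevity.

```python
import random

random.seed(0)

K, C_IN, N_FILTERS = 5, 32, 64  # kernel size, input channels, number of filters

# One 5x5 patch cut out of the 14x14x32 input, indexed as patch[i][j][c].
patch = [[[random.random() for _ in range(C_IN)] for _ in range(K)]
         for _ in range(K)]

# 64 filters; each has its own weight for every position AND every channel,
# indexed as filters[f][i][j][c]. Values are random placeholders.
filters = [
    [[[random.gauss(0, 0.1) for _ in range(C_IN)] for _ in range(K)]
     for _ in range(K)]
    for _ in range(N_FILTERS)
]

def apply_filter(patch, weights):
    """Weighted sum over all K*K*C_IN values of the patch, giving one scalar."""
    return sum(
        patch[i][j][c] * weights[i][j][c]
        for i in range(K) for j in range(K) for c in range(C_IN)
    )

# One scalar per filter: the 32 input channels are summed away.
outputs = [apply_filter(patch, w) for w in filters]

weights_per_filter = K * K * C_IN
print(len(outputs), weights_per_filter)  # prints: 64 800
```

Each filter collapses the 32 input channels into one number, so this output position ends up with 64 values, not 32*64.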

Lennart Scharmann · Answer

The depth is usually given implicitly. For example, many images are considered to have depth 3 (for the three color channels of each pixel), so a "5x5 filter" means a 5x5x3 filter. In your case, the 5x5 filter in the second layer is really a 5x5x32 filter.

Depth one is usually explicitly stated (as in "5x5x1 filter").
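The shape bookkeeping for both layers can be sketched as below. The helper `conv_output_shape` is hypothetical, and it assumes "same" padding, which is what 28 -> 14 -> 7 with stride 2 suggests; without padding, a 5x5 filter with stride 2 on a 28x28 input would give 12x12 instead.

```python
import math

def conv_output_shape(height, width, in_channels, num_filters, kernel=5, stride=2):
    """Return (output shape, shape of one filter, total weight count).

    Assumes 'same' padding, so each spatial size becomes ceil(size / stride).
    Biases are not counted.
    """
    out_h = math.ceil(height / stride)
    out_w = math.ceil(width / stride)
    # The filter depth is implicit: it always equals the input depth.
    filter_shape = (kernel, kernel, in_channels)
    weight_count = kernel * kernel * in_channels * num_filters
    return (out_h, out_w, num_filters), filter_shape, weight_count

# Layer 1: 28x28x1 input, 32 filters.
shape1, filt1, weights1 = conv_output_shape(28, 28, 1, 32)
# Layer 2: takes the output of layer 1 as input, 64 filters.
shape2, filt2, weights2 = conv_output_shape(*shape1, 64)

print(shape1, filt1, weights1)  # (14, 14, 32) (5, 5, 1) 800
print(shape2, filt2, weights2)  # (7, 7, 64) (5, 5, 32) 51200
```

The output depth of each layer equals the number of filters, while the input depth only shows up in the size of each filter.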

Tags: neural-network, deep-learning, conv-neural-network, convolution

Asked by Miriam Farber
