 

How To Determine the 'filter' Parameter in the Keras Conv2D Function

I'm just beginning my ML journey and have done a few tutorials. One thing that's not clear (to me) is how the 'filters' parameter is determined for Keras Conv2D.

Most sources I've read simply set the parameter to 32 without explanation. Is this just a rule of thumb, or do the dimensions of the input images play a part? For example, the images in CIFAR-10 are 32x32.

Specifically:

from keras.models import Sequential
from keras.layers import Conv2D, Activation, MaxPooling2D, Dropout

model = Sequential()
filters = 32
# First conv block: 32 filters of size 3x3, padded to keep the spatial size.
model.add(Conv2D(filters, (3, 3), padding='same', input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(filters, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

The next Conv2D block has a filters parameter of filters * 2, or 64. Again, how is this value determined?

Thanks,

Joe

asked Jan 13 '18 by Joe


2 Answers

Actually, there is no single good answer to your question. Most architectures are carefully designed and fine-tuned over many experiments. I can share some of the rules of thumb one should apply when designing one's own architecture:

  1. Avoid a dimension collapse in the first layer. Suppose your first-layer filters have an (n, n) spatial shape and the input is an RGB image. A single filter then sees n * n * 3 input values, so it is good practice to set the number of filters to be greater than n * n * 3, the input dimensionality of a single filter. If you set a smaller number, you may lose useful information about the image because the initialization drops informative dimensions. Of course, this is not a general rule; for texture recognition, where image complexity is lower, a small number of filters might actually help. (The first sketch after this list works through the arithmetic.)

  2. Think more about volume than filter count. When setting the number of filters, it is more important to track how the feature-map volume (height x width x channels) changes between consecutive layers than how the raw filter count changes. E.g. in VGG, even though the number of filters doubles after each pooling layer, the feature-map volume is still halved, because pooling shrinks the spatial area by a factor of 4. Decreasing the volume by more than a factor of 3 should usually be considered bad practice; most modern architectures keep the volume drop factor between 1 and 2. Still, this is not a general rule; in the case of a narrow hierarchy, a larger volume drop might actually help. (The second sketch after this list shows the VGG-style numbers.)

  3. Avoid bottlenecking. As the Inception papers discuss, bottlenecking (dropping the volume too severely between layers) can seriously harm your training process. Severe volume drops can still work, but then you should use intelligent downsampling of the kind used in Inception v2 and later.

  4. Check 1x1 convolutions. Filter activations are believed to be highly correlated, and you can take advantage of this with 1x1 convolutions, i.e. convolutions with a filter size of 1. They make it possible to drop volume along the channel dimension instead of using pooling or intelligent downsampling. You could, for example, build twice as many filters and then cut 25% of them with a 1x1 convolution as the next layer. (The third sketch after this list shows this in Keras.)
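To make rule 1 concrete, here is a quick back-of-the-envelope check using the 3x3 kernels and RGB images from the question (the numbers are illustrative, not prescribed):

# Rule 1: a single k x k filter over an RGB input sees k * k * 3 values,
# so the first layer's filter count should usually be at least that large.
kernel = 3
channels = 3  # RGB
single_filter_input_dim = kernel * kernel * channels
print(single_filter_input_dim)  # 27 -> the common choice of 32 filters clears it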
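Rule 2 in numbers, VGG-style (the 32x32x32 starting volume is just an example shape I've picked, not from the answer):

# After 2x2 pooling the spatial area shrinks by a factor of 4, so doubling
# the filter count still halves the total feature-map volume.
h = w = 32   # spatial size before pooling
f = 32       # filters before pooling
vol_before = h * w * f                     # 32 * 32 * 32 = 32768
vol_after = (h // 2) * (w // 2) * (f * 2)  # 16 * 16 * 64 = 16384
print(vol_before / vol_after)              # 2.0 -> a volume drop factor of 2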
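And a minimal Keras sketch of rule 4 (the layer sizes are made up for illustration):

from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
# Build "twice as many" 3x3 filters first...
model.add(Conv2D(64, (3, 3), padding='same', activation='relu',
                 input_shape=(32, 32, 3)))
# ...then a 1x1 convolution cuts 25% of the channels (64 -> 48).
model.add(Conv2D(48, (1, 1), activation='relu'))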

As you can see, there is no easy way to choose the number of filters. Beyond the hints above, I'd like to share one of my favorite sanity checks on the number of filters. It takes 2 easy steps:

  1. Try to overfit 500 random images with regularization enabled.
  2. Try to overfit the whole dataset without any regularization.

Usually, if the number of filters is too low, these two tests will show it: the network will fail to overfit. Conversely, if your network severely overfits during normal training with regularization, that is a clear indicator that it has far too many filters. A minimal sketch of these checks follows below.
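A minimal sketch of the two checks on CIFAR-10 (the classifier head and all hyperparameters here are placeholders I've added to make it runnable, not part of the original recipe):

import numpy as np
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.utils import to_categorical

(x_train, y_train), _ = cifar10.load_data()
x_train = x_train.astype('float32') / 255.0
y_train = to_categorical(y_train, 10)

def build(filters, dropout):
    model = Sequential([
        Conv2D(filters, (3, 3), activation='relu', padding='same',
               input_shape=(32, 32, 3)),
        MaxPooling2D((2, 2)),
        Dropout(dropout),
        Flatten(),
        Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Check 1: 500 random images, regularization (Dropout) kept on.
idx = np.random.choice(len(x_train), 500, replace=False)
build(filters=32, dropout=0.25).fit(x_train[idx], y_train[idx], epochs=100)

# Check 2: the whole dataset, regularization switched off.
build(filters=32, dropout=0.0).fit(x_train, y_train, epochs=50)

# If training accuracy stays well below 100% in both runs, the filter
# count is probably too low.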

Cheers.

answered Oct 20 '22 by Marcin Możejko

The number of filters is chosen based on the complexity of the task: more complex tasks require more filters. Usually the number of filters grows after every layer (e.g. 128 -> 256 -> 512). The first layers (with fewer filters) catch simple features of the images (edges, color tone, etc.), and the next layers try to build more complex features on top of those simple ones, as sketched below.
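A rough sketch of that growth pattern (the input shape and kernel sizes are placeholders; the filter counts are the example progression above):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D

model = Sequential()
# Early layers: fewer filters for simple features (edges, color tone, ...).
model.add(Conv2D(128, (3, 3), activation='relu', padding='same',
                 input_shape=(64, 64, 3)))
model.add(MaxPooling2D((2, 2)))
# Deeper layers: more filters for increasingly complex features.
model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))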

There is a nice course from Stanford that will give you intuition about and an understanding of CNNs.

answered Oct 20 '22 by aiven