
Keras Conv2D: filters vs kernel_size

Tags:

What's the difference between those two? It would also help to explain in the more general context of convolutional networks.

Also, as a side note, what is channels? In other words, please break down the 3 terms for me: channels vs filters vs kernel.

asked Jul 04 '18 by Baron Yugovich

People also ask

What is Kernel_size in Conv2D?

kernel_size: An integer or tuple/list of 2 integers, specifying the height and width of the 2D convolution window. Can be a single integer to specify the same value for all spatial dimensions. strides: An integer or tuple/list of 2 integers, specifying the strides of the convolution along the height and width.
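As a minimal sketch (assuming the TensorFlow-bundled Keras API), kernel_size and strides can each be given as a single integer or as a (height, width) pair:

```python
from tensorflow.keras import layers

conv_square  = layers.Conv2D(filters=32, kernel_size=3)                   # 3x3 window
conv_rect    = layers.Conv2D(filters=32, kernel_size=(3, 5))              # 3-high, 5-wide window
conv_strided = layers.Conv2D(filters=32, kernel_size=3, strides=(2, 2))   # moves the window 2 pixels at a time
```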

What does filters mean in Conv2D?

A filter or kernel in a Conv2D layer has a height and a width. It is generally smaller than the input image, so we slide it across the whole image. The area of the image that the filter currently covers is called the receptive field.

What is the difference between conv1d and Conv2D?

tf.layers.conv1d is used when you slide your convolution kernels along one dimension (i.e. you reuse the same weights, sliding them along one dimension), whereas tf.layers.conv2d is used when you slide your convolution kernels along two dimensions (i.e. you reuse the same weights, sliding them along two dimensions).
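A rough illustration of the difference (assuming the TensorFlow-bundled Keras API; shapes use "valid" padding, the default):

```python
import tensorflow as tf
from tensorflow.keras import layers

seq = tf.random.normal((1, 100, 8))      # (batch, steps, channels) -- e.g. a time series
img = tf.random.normal((1, 64, 64, 3))   # (batch, height, width, channels) -- e.g. an RGB image

print(layers.Conv1D(16, kernel_size=3)(seq).shape)   # (1, 98, 16)     -- kernel slides over steps only
print(layers.Conv2D(16, kernel_size=3)(img).shape)   # (1, 62, 62, 16) -- kernel slides over height and width
```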

Is kernel size same as filter?

They are the same! A filter or kernel is simply a group of weights shared across the whole input space.


1 Answer

Each convolution layer consists of several convolution channels (a.k.a. depth or filters). In practice, this is a number such as 64, 128, 256, 512, etc., and it equals the number of channels in the output of the convolutional layer. kernel_size, on the other hand, is the spatial size of these convolution filters. In practice, it takes values such as 3x3, 1x1, or 5x5, which can be abbreviated to 3, 1, or 5 since the filters are usually square.
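A minimal sketch of this (assuming the TensorFlow-bundled Keras API): filters sets how many output channels the layer produces, kernel_size sets the spatial size of each filter.

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((1, 28, 28, 3))                       # (batch, height, width, input channels)
conv = layers.Conv2D(filters=64, kernel_size=3, padding="same")

y = conv(x)
print(y.shape)   # (1, 28, 28, 64) -- 64 output channels, one per filter
```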

Edit

The following quote should make it clearer.

Discussion on vlfeat

Suppose X is an input with size W x H x D x N (where N is the size of the batch) to a convolutional layer containing filter F (with size FW x FH x FD x K) in a network.

The number of feature channels D is the third dimension of the input X here (for example, this is typically 3 at the first input to the network if the input consists of colour images). The number of filters K is the fourth dimension of F. The two concepts are closely linked because if the number of filters in a layer is K, it produces an output with K feature channels. So the input to the next layer will have K feature channels.

The FW x FH above is the filter size you are looking for.
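You can check these dimensions directly (assuming the TensorFlow-bundled Keras API): with D = 3 input channels and K = 64 filters of size 5x5, the layer's kernel weights have shape (FH, FW, FD, K) and the output has K feature channels.

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((8, 32, 32, 3))                        # N=8, W=H=32, D=3
conv = layers.Conv2D(filters=64, kernel_size=5, padding="same")

y = conv(x)
kernel, bias = conv.get_weights()
print(kernel.shape)   # (5, 5, 3, 64) -- FH x FW x FD x K
print(y.shape)        # (8, 32, 32, 64) -- the next layer sees K = 64 feature channels
```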

Added

You should be familiar with filters. You can consider each filter to be responsible for extracting some type of feature from a raw image. A CNN tries to learn such filters, i.e. the filters parameterized in a CNN are learned during training. Each filter in a Conv2D layer is applied to every input channel, and the results are combined to produce one output channel, as sketched below. So the number of filters and the number of output channels are the same.
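A hand-rolled sketch of that combining step (plain NumPy, one spatial position only, names are illustrative): one output value is obtained by multiplying a filter with the input window across all channels and summing, so K filters give K output channels.

```python
import numpy as np

def conv2d_single_position(patch, kernel):
    # patch: (FH, FW, D) window of the input; kernel: (FH, FW, D) weights of one filter
    # multiply-accumulate over height, width AND input channels -> a single scalar
    return np.sum(patch * kernel)

patch   = np.random.rand(3, 3, 3)                           # a 3x3 window of a 3-channel input
filters = [np.random.rand(3, 3, 3) for _ in range(64)]      # K = 64 learned filters

outputs = [conv2d_single_position(patch, f) for f in filters]
print(len(outputs))   # 64 -- one output value per filter at this spatial position
```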

answered Oct 13 '22 by Dinesh