
Keras Conv2D and input channels

Tags: python, keras

The Keras layer documentation specifies the input and output sizes for convolutional layers: https://keras.io/layers/convolutional/

Input shape: (samples, channels, rows, cols)

Output shape: (samples, filters, new_rows, new_cols)

The kernel size is a spatial parameter, i.e. it determines only width and height.

So an input with c channels will yield an output with filters channels regardless of the value of c. It must therefore apply 2D convolution with a spatial height x width filter and then aggregate the results somehow for each learned filter.

What is this aggregation operator? Is it a summation across channels? Can I control it? I couldn't find any information in the Keras documentation.

  • Note that in TensorFlow the filters are specified along the depth dimension as well: https://www.tensorflow.org/api_guides/python/nn#Convolution, so the depth operation is clear there.

Thanks.

asked Apr 09 '17 11:04 by yoki




1 Answer

It might be confusing that the layer is called Conv2D (it was to me, which is why I came looking for this answer), because, as Nilesh Birari commented:

I guess you are missing it's 3D kernel [width, height, depth]. So the result is summation across channels.

Perhaps the 2D stems from the fact that the kernel only slides along two dimensions. The third dimension is fixed and determined by the number of input channels (the input depth).
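To make the "summation across channels" concrete, here is a minimal NumPy sketch, not the Keras implementation. It assumes a single channels_last sample, "valid" padding, stride 1, and no bias or activation. The helper name conv2d_valid is made up for illustration, and the kernel layout (k_rows, k_cols, channels, filters) is chosen to mirror how, to my understanding, Keras stores Conv2D weights. It checks that convolving with the full 3D kernel gives the same result as convolving each input channel separately and adding the results.

```python
import numpy as np


def conv2d_valid(x, kernel):
    """Naive 'valid' 2D convolution (cross-correlation, as used in deep learning).

    x:      (rows, cols, channels)               one channels_last sample
    kernel: (k_rows, k_cols, channels, filters)  assumed weight layout
    returns (new_rows, new_cols, filters)
    """
    rows, cols, channels = x.shape
    k_rows, k_cols, k_channels, filters = kernel.shape
    assert channels == k_channels, "kernel depth must match input channels"

    new_rows, new_cols = rows - k_rows + 1, cols - k_cols + 1
    out = np.zeros((new_rows, new_cols, filters))
    for i in range(new_rows):
        for j in range(new_cols):
            # The kernel slides over rows and cols only; it always spans all channels.
            patch = x[i:i + k_rows, j:j + k_cols, :]
            # Multiply with every filter and sum over rows, cols AND channels.
            out[i, j, :] = np.tensordot(patch, kernel, axes=([0, 1, 2], [0, 1, 2]))
    return out


rng = np.random.default_rng(0)
x = rng.normal(size=(6, 6, 3))          # 6x6 input with 3 channels
kernel = rng.normal(size=(3, 3, 3, 4))  # 3x3 spatial size, depth 3, 4 filters

full = conv2d_valid(x, kernel)

# Same result built channel by channel: convolve each input channel with
# its own 2D slice of the kernel, then add the per-channel results.
per_channel = sum(
    conv2d_valid(x[:, :, c:c + 1], kernel[:, :, c:c + 1, :])
    for c in range(x.shape[2])
)

print(full.shape)                      # (4, 4, 4)
print(np.allclose(full, per_channel))  # True
```

The output has 4 channels because there are 4 filters, whatever the input depth is. The input depth only changes how many terms are summed at each output position.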

For a more elaborate explanation, read https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/

I plucked an illustrative image from there:

[Image: kernel depth]

answered Sep 24 '22 05:09 by noio