Keras Conv2D and input channels

Tags: python, keras

The Keras layer documentation specifies the input and output sizes for convolutional layers: https://keras.io/layers/convolutional/

Input shape: (samples, channels, rows, cols)

Output shape: (samples, filters, new_rows, new_cols)

And the kernel size is a spatial parameter, i.e. it determines only width and height.

So an input with c channels will yield an output with filters channels regardless of the value of c. It must therefore apply 2D convolution with a spatial height x width filter and then aggregate the results somehow for each learned filter.

What is this aggregation operator? Is it a summation across channels? Can I control it? I couldn't find any information in the Keras documentation.

Note that in TensorFlow the filter shape includes the depth dimension as well, so the depth operation is clear there: https://www.tensorflow.org/api_guides/python/nn#Convolution

Thanks.

asked Apr 09 '17 11:04

yoki

1 Answer

It might be confusing that the layer is called Conv2D (it was to me, which is why I came looking for this answer), because, as Nilesh Birari commented:

I guess you are missing it's 3D kernel [width, height, depth]. So the result is summation across channels.

Perhaps the "2D" stems from the fact that the kernel only slides along two dimensions; the third dimension is fixed and determined by the number of input channels (the input depth).
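
To make the summation concrete, here is a minimal pure-Python sketch of the idea, not Keras's actual implementation. The helper names (correlate2d, conv2d_one_filter, conv2d) and the nested-list layout [filter][channel][row][col] are my own choices for readability; Keras stores the kernel as (kernel_rows, kernel_cols, input_channels, filters). Like Keras, the sketch does not flip the kernel (strictly speaking it is a cross-correlation), and it assumes "valid" padding, stride 1, and no activation.

```python
def correlate2d(plane, kernel2d):
    """'valid' 2D cross-correlation of one channel with one 2D kernel slice."""
    k_rows, k_cols = len(kernel2d), len(kernel2d[0])
    new_rows = len(plane) - k_rows + 1
    new_cols = len(plane[0]) - k_cols + 1
    return [
        [
            sum(plane[r + i][c + j] * kernel2d[i][j]
                for i in range(k_rows) for j in range(k_cols))
            for c in range(new_cols)
        ]
        for r in range(new_rows)
    ]


def conv2d_one_filter(image, kernel3d, bias=0.0):
    """One output channel: per-channel 2D results summed across input channels.

    image    is indexed [channel][row][col]
    kernel3d is indexed [channel][k_row][k_col], one 2D slice per input channel
    """
    assert len(image) == len(kernel3d), "kernel depth must equal input channels"
    per_channel = [correlate2d(p, k) for p, k in zip(image, kernel3d)]
    new_rows, new_cols = len(per_channel[0]), len(per_channel[0][0])
    # The aggregation step: a plain sum over the input channels (plus a bias).
    return [
        [bias + sum(ch[r][c] for ch in per_channel) for c in range(new_cols)]
        for r in range(new_rows)
    ]


def conv2d(image, filters):
    """Stack one output map per filter: result is [filter][new_row][new_col]."""
    return [conv2d_one_filter(image, kernel3d) for kernel3d in filters]


# Toy input: 2 channels, 3x3 each.
image = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],  # channel 0
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],  # channel 1
]
# 3 filters, each a 2x2 spatial kernel with depth 2 (= input channels).
filters = [
    [[[1, 0], [0, 1]], [[1, 1], [1, 1]]],
    [[[0, 1], [1, 0]], [[0, 0], [0, 0]]],
    [[[1, 1], [1, 1]], [[-1, 0], [0, -1]]],
]
output = conv2d(image, filters)
# Output dimensions: (filters, new_rows, new_cols)
print(len(output), len(output[0]), len(output[0][0]))
```

The number of output channels equals the number of filters, whatever the input depth is, because each filter collapses all input channels into a single map.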

For a more elaborate explanation, read https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/

I plucked an illustrative image from there:

[Image: kernel depth] https://i.stack.imgur.com/BZHGo.png

answered Sep 24 '22 05:09

noio
