Why is the convolutional filter flipped in convolutional neural networks?

Tags: neural-network, conv-neural-network, convolution, theano, lasagne

I don't understand why there is the need to flip filters when using convolutional neural networks.

According to the lasagne documentation,

flip_filters : bool (default: True)

Whether to flip the filters before sliding them over the input, performing a convolution (this is the default), or not to flip them and perform a correlation. Note that for some other convolutional layers in Lasagne, flipping incurs an overhead and is disabled by default – check the documentation when using learned weights from another layer.

What does that mean? I have never read about flipping filters when convolving in any neural network book. Would someone clarify, please?

asked Jul 17 '17 19:07

KenobiBastila

2 Answers

The underlying reason for flipping a convolutional filter is the definition of the convolution operation, which comes from signal processing. When performing a convolution, the kernel is flipped along each axis over which you convolve; if you don't flip it, you end up computing a cross-correlation of the signal with the kernel instead of a convolution. It's a bit easier to understand if you think about applying a 1D convolution to a time series in which the function in question changes very sharply: with the flip, a sharp spike in the input reproduces the kernel in its original orientation, whereas without the flip the kernel comes out reversed.
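To make the distinction concrete, here is a minimal pure-Python sketch of the two operations in 1D. The function names `correlate1d` and `convolve1d` are illustrative only (they are not Lasagne functions), and both assume "valid" mode, where the kernel always stays fully inside the signal.

```python
def correlate1d(signal, kernel):
    """Cross-correlation, 'valid' mode: slide the kernel as stored."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(n - k + 1)]


def convolve1d(signal, kernel):
    """Convolution, 'valid' mode: reverse the kernel, then slide it."""
    return correlate1d(signal, kernel[::-1])


signal = [0, 0, 1, 0, 0]   # a single sharp spike (unit impulse)
kernel = [1, 2, 3]         # asymmetric, so the flip is visible

print(convolve1d(signal, kernel))   # [1, 2, 3]
print(correlate1d(signal, kernel))  # [3, 2, 1]
```

For a symmetric kernel such as `[1, 2, 1]` the two functions return the same result, which is one reason the difference is easy to overlook.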

This answer from the digital signal processing stack exchange site gives an excellent explanation that walks through the mathematics of why convolutional filters are defined to go in the reverse direction of the signal.

This page walks through a detailed example where the flip is done. The example uses a Sobel filter, a particular type of filter used for edge detection. It doesn't explain why the flip is done, but it is nice because it gives you a worked-out example in 2D.
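As a rough 2D illustration in the same spirit (this is not the example from the linked page), the sketch below applies a Sobel kernel to a tiny image with and without the flip. The helper names `correlate2d`, `flip2d` and `convolve2d` are hypothetical, the image is made up, and the sign convention chosen for the Sobel kernel is only one of the conventions in use.

```python
def correlate2d(image, kernel):
    """'Valid' 2D cross-correlation on nested lists (kernel not flipped)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def flip2d(kernel):
    """Reverse the kernel along both axes (a 180-degree rotation)."""
    return [row[::-1] for row in kernel[::-1]]


def convolve2d(image, kernel):
    """'Valid' 2D convolution: flip the kernel, then correlate."""
    return correlate2d(image, flip2d(kernel))


# Horizontal-gradient Sobel kernel (one common sign convention).
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

# A dark-to-bright vertical edge.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

print(correlate2d(image, sobel_x))  # [[4, 4]]
print(convolve2d(image, sobel_x))   # [[-4, -4]]
```

Because this kernel is antisymmetric, flipping it is the same as negating it, so the two conventions detect the same edge but report it with opposite sign.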

I mentioned that the why (as in, why convolution is defined this way) is a bit easier to understand in the 1D case (the answer from the DSP SE site is really a great explanation), but this convention applies to 2D and 3D as well (the Conv2DDNN and Conv3DDNN layers both have the flip_filters option). Ultimately, however, because the convolutional filter weights are not something that a human programs, but rather are "learned" by the network, the choice is entirely arbitrary, unless you are loading weights from another network, in which case you must be consistent with the definition of convolution used in that network. If convolution was defined correctly (i.e., according to convention), the filter will be flipped. If it was defined incorrectly (in the more "naive" and "lazy" way, which is strictly a cross-correlation), it will not.

The broader field that convolutions are a part of is "linear systems theory", so searching for this term might turn up more about this, albeit outside the context of neural networks.

Note that the convolution/correlation distinction is also mentioned in the docstrings in the corrmm.py module in lasagne:

flip_filters : bool (default: False)

Whether to flip the filters and perform a convolution, or not to flip them and perform a correlation. Flipping adds a bit of overhead, so it is disabled by default. In most cases this does not make a difference anyway because the filters are learnt. However, flip_filters should be set to True if weights are loaded into it that were learnt using a regular :class:`lasagne.layers.Conv2DLayer`, for example.
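The weight-loading caveat can be sketched without Lasagne at all. The snippet below is a simplified 1D stand-in, not Lasagne's implementation: `apply_layer` is a hypothetical function, and the weights are made-up numbers standing in for filters learned under the flipping convention.

```python
def apply_layer(inputs, weights, flip_filters):
    """Hypothetical 1D 'layer': slide weights over inputs ('valid' mode).

    flip_filters=True  -> true convolution (weights reversed first)
    flip_filters=False -> cross-correlation (weights used as stored)
    """
    w = weights[::-1] if flip_filters else weights
    k = len(w)
    return [sum(inputs[i + j] * w[j] for j in range(k))
            for i in range(len(inputs) - k + 1)]


inputs = [1, 4, 2, 7, 3]
learned = [0.5, -1.0, 2.0]   # stand-in for weights trained with flipping on

reference = apply_layer(inputs, learned, flip_filters=True)

# Loading the same weights into a layer that uses the other convention
# silently changes the output...
mismatched = apply_layer(inputs, learned, flip_filters=False)

# ...unless the stored weights are reversed once when they are ported.
ported = apply_layer(inputs, learned[::-1], flip_filters=False)

print(reference == ported)      # True
print(reference == mismatched)  # False
```

In 2D the same idea applies, except that the kernel has to be reversed along both spatial axes. When a network is trained from scratch under a single convention, none of this matters, since the optimizer simply learns whichever orientation the layer expects.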

answered Sep 21 '22 08:09

charlesreid1

Firstly, since CNNs are trained from scratch rather than designed by hand, if the flip operation were necessary, the learned filters would simply be the flipped ones, and cross-correlation with those flipped filters is what gets implemented. Secondly, flipping is necessary in 1D time-series processing, since past inputs affect the current system output together with the "current" input. But in 2D/3D spatial convolution over images there is no concept of "time", and hence no "past" input and no impact of it on "now". Therefore, we don't need to consider the relationship between "signal" and "system"; there is only the relationship between "signal" (image patch) and "signal" (image patch), which means we only need cross-correlation instead of convolution (although DL borrows this concept from signal processing). Therefore, the flip operation is actually not needed. (I guess.)
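The time-series argument can be illustrated with a small sketch, assuming a linear time-invariant system described by a made-up impulse response `h`. Each input sample leaves behind a scaled, delayed copy of `h`, and adding those copies up gives the textbook convolution sum, in which the signal index runs backwards (`x[n - k]`). That reversed index is the "flip". The function names here are illustrative only.

```python
def superpose_impulse_responses(x, h):
    """Add one scaled, delayed copy of h for every input sample."""
    y = [0] * (len(x) + len(h) - 1)
    for t, sample in enumerate(x):       # input arrives at time t
        for k, weight in enumerate(h):   # and is felt k steps later
            y[t + k] += sample * weight
    return y


def convolve_full(x, h):
    """Textbook sum y[n] = sum over k of h[k] * x[n - k] ('full' mode)."""
    y = []
    for n in range(len(x) + len(h) - 1):
        y.append(sum(h[k] * x[n - k]
                     for k in range(len(h)) if 0 <= n - k < len(x)))
    return y


x = [1, 2, 3]        # input signal over time
h = [1, 0.5, 0.25]   # hypothetical impulse response of the system

print(superpose_impulse_responses(x, h))  # [1, 2.5, 4.25, 2.0, 0.75]
print(convolve_full(x, h))                # [1, 2.5, 4.25, 2.0, 0.75]
```

For image patches there is no such causal ordering between "past" and "present" pixels, which is the point made above about why plain cross-correlation is enough there.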

answered Sep 18 '22 08:09

WBR
