Say we have a single channel image (5x5)
A = [ 1 2 3 4 5
      6 7 8 9 2
      1 4 5 6 3
      4 5 6 7 4
      3 4 5 6 2 ]
And a filter K (2x2)
K = [ 1 1
      1 1 ]
An example of applying convolution (let us take the first 2x2 from A) would be
1*1 + 2*1 + 6*1 + 7*1 = 16
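The same arithmetic can be checked for every window position with a few lines of NumPy. This is only a minimal sketch: the helper name `conv2d_valid` is made up for illustration, and it assumes stride 1 with no padding, so the 5x5 input shrinks to a 4x4 output.

```python
import numpy as np

# The 5x5 single-channel image and the 2x2 filter from the question.
A = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 2],
    [1, 4, 5, 6, 3],
    [4, 5, 6, 7, 4],
    [3, 4, 5, 6, 2],
])
K = np.ones((2, 2), dtype=int)


def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    Hypothetical helper: each output value is the sum of the
    element-wise product of the kernel and the window under it.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * kernel)
    return out


result = conv2d_valid(A, K)
print(result[0, 0])   # 1*1 + 2*1 + 6*1 + 7*1 = 16
print(result.shape)   # (4, 4)
```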
This is very straightforward. But let us add a depth dimension to matrix A, i.e., an RGB image with 3 channels, or even the conv layers in a deep network (with depth = 512, maybe). How would the convolution operation be done with the same filter? A similar worked example would be really helpful for the RGB case.
As you can see in the image, each channel is convolved individually and the results are then combined to form a pixel. This is how a blurring operation works. In a convolutional layer, the kernel has different weights for each channel, and the 3 per-channel results are added together to produce a single-channel output.
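To make the channel-summing case concrete, here is a rough sketch that extends the 2x2 example to depth 3. Everything beyond A itself is an assumption for illustration: the three channels are simply copies of A, the kernel slices are all-ones scaled by 1, 2 and 3, and `conv2d_multichannel` is a made-up helper, not a library function.

```python
import numpy as np

A = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 2],
    [1, 4, 5, 6, 3],
    [4, 5, 6, 7, 4],
    [3, 4, 5, 6, 2],
])

# Hypothetical 3-channel image: each channel is a copy of A, shape (5, 5, 3).
image = np.stack([A, A, A], axis=-1)

# Hypothetical 2x2x3 kernel: one 2x2 slice per channel, with different weights.
kernel = np.stack([
    1 * np.ones((2, 2), dtype=int),
    2 * np.ones((2, 2), dtype=int),
    3 * np.ones((2, 2), dtype=int),
], axis=-1)


def conv2d_multichannel(image, kernel):
    """image: (H, W, C), kernel: (kh, kw, C) -> output: (H-kh+1, W-kw+1).

    The kernel depth must equal the image depth. Each output value sums
    over height, width AND channels, so the result has a single channel.
    """
    kh, kw, kc = kernel.shape
    h, w, c = image.shape
    assert c == kc, "kernel depth must match image depth"
    out = np.zeros((h - kh + 1, w - kw + 1), dtype=image.dtype)
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw, :] * kernel)
    return out


out = conv2d_multichannel(image, kernel)
print(out[0, 0])   # 16*1 + 16*2 + 16*3 = 96
print(out.shape)   # (4, 4)
```

The same function works unchanged for a depth of 512: only the last axis of `image` and `kernel` grows.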
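The idea of convolution on volumes turns out to be really powerful. Only a small part of it is that you can now operate directly on RGB images with 3 channels. Even more important is that, by applying several filters to the same input, you can detect multiple features at once, for example 2 features like horizontal and vertical edges, with each filter producing one channel of the output.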
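A small sketch of that idea: two filters applied to the same RGB volume give an output with two channels. The edge-detector weights, the toy 6x6 image and the helper `conv_layer` are all assumptions chosen for illustration, not anything specified in the text above.

```python
import numpy as np


def conv_layer(image, filters):
    """image: (H, W, C), filters: (F, kh, kw, C) -> output: (H-kh+1, W-kw+1, F).

    Hypothetical helper: each filter spans the full depth of the input
    and contributes exactly one channel of the output.
    """
    n_filters, kh, kw, kc = filters.shape
    h, w, c = image.shape
    assert c == kc, "filter depth must match image depth"
    out = np.zeros((h - kh + 1, w - kw + 1, n_filters))
    for f in range(n_filters):
        for i in range(h - kh + 1):
            for j in range(w - kw + 1):
                out[i, j, f] = np.sum(image[i:i + kh, j:j + kw, :] * filters[f])
    return out


# Simple 3x3 edge detectors (an assumed, commonly used choice of weights).
vertical = np.array([[1, 0, -1]] * 3, dtype=float)  # responds to vertical edges
horizontal = vertical.T                             # responds to horizontal edges

# Repeat each 2D detector across the 3 colour channels -> shape (2, 3, 3, 3).
filters = np.stack([
    np.repeat(vertical[:, :, None], 3, axis=2),
    np.repeat(horizontal[:, :, None], 3, axis=2),
])

# Toy 6x6 RGB image: bright left half, dark right half (one vertical edge).
image = np.zeros((6, 6, 3))
image[:, :3, :] = 10.0

out = conv_layer(image, filters)
print(out.shape)     # (4, 4, 2): one output channel per filter
print(out[0, :, 0])  # vertical-edge response along the first row
print(out[0, :, 1])  # horizontal-edge response along the first row
```

On this toy image only the first output channel is non-zero, since the image contains a vertical edge and no horizontal one.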
Convolution is a simple mathematical operation which is fundamental to many common image processing operators. Convolution provides a way of 'multiplying together' two arrays of numbers, generally of different sizes, but of the same dimensionality, to produce a third array of numbers of the same dimensionality.
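Written out for the 2D case, the sliding-window sum used in the worked example looks roughly as follows. The symbols are assumptions for illustration: O is the output array, k_h and k_w are the kernel height and width, and indices start at 0. The kernel is not flipped here, which is the convention deep-learning libraries use (strictly speaking, cross-correlation); for a symmetric kernel such as the all-ones K it makes no difference.

```latex
O(i, j) = \sum_{m=0}^{k_h - 1} \sum_{n=0}^{k_w - 1} A(i + m,\, j + n)\, K(m, n)
```

With the 5x5 A and the 2x2 K from the question, O is 4x4 and O(0, 0) = 1 + 2 + 6 + 7 = 16.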
Let's say we have a 3-channel (RGB) image given by some matrix A
A = [[[198 218 227] [196 216 225] [196 214 224] ... ... [185 201 217] [176 192 208] [162 178 194]]
and a blur kernel K
K = [[0.1111, 0.1111, 0.1111], [0.1111, 0.1111, 0.1111], [0.1111, 0.1111, 0.1111]] # each entry is 0.1111 ~= 1/9
The convolution can be represented as shown in the image below
As you can see in the image, each channel is convolved individually and the results are then combined to form a pixel.
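The per-channel blur can be sketched like this. Since the full matrix A is not shown above, the sketch uses a small random stand-in image; that image and the helper `blur_rgb` are assumptions for illustration. The same 3x3 kernel is applied to R, G and B separately, so the output keeps its 3 channels.

```python
import numpy as np

# 3x3 box-blur kernel: every weight is 1/9, so each output is a local mean.
K = np.full((3, 3), 1.0 / 9.0)


def blur_rgb(image, kernel):
    """image: (H, W, 3), kernel: (kh, kw) -> output: (H-kh+1, W-kw+1, 3).

    Hypothetical helper: the same 2D kernel is slid over each colour
    channel on its own; the channels are never summed together.
    """
    kh, kw = kernel.shape
    h, w, c = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1, c))
    for ch in range(c):
        for i in range(h - kh + 1):
            for j in range(w - kw + 1):
                out[i, j, ch] = np.sum(image[i:i + kh, j:j + kw, ch] * kernel)
    return out


# Stand-in 5x5 RGB image with random 0..255 values (not the A from the text).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(5, 5, 3)).astype(float)

blurred = blur_rgb(image, K)
print(blurred.shape)  # (3, 3, 3): still one value per colour channel
```

Each blurred pixel is then the (R, G, B) triple of the three per-channel averages, which is the "combined to form a pixel" step.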