How is a convolution calculated on an image with three (RGB) channels?

Say we have a single channel image (5x5)

    A = [ 1 2 3 4 5
          6 7 8 9 2
          1 4 5 6 3
          4 5 6 7 4
          3 4 5 6 2 ]

And a filter K (2x2)

    K = [ 1 1
          1 1 ]

An example of applying convolution (let us take the first 2x2 from A) would be

1*1 + 2*1 + 6*1 + 7*1 = 16 
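The same sum can be repeated for every 2x2 window of A. Below is a minimal sketch using NumPy; the helper name `conv2d_valid` is made up for illustration, and it assumes stride 1, no padding, and no kernel flip (the same element-wise multiply-and-sum used in the hand calculation above).

```python
import numpy as np

# The 5x5 single-channel image and 2x2 filter from the question.
A = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 2],
    [1, 4, 5, 6, 3],
    [4, 5, 6, 7, 4],
    [3, 4, 5, 6, 2],
], dtype=float)
K = np.ones((2, 2))


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding, no flip)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the window and the kernel, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


result = conv2d_valid(A, K)
print(result.shape)   # (4, 4)
print(result[0, 0])   # 16.0, matching the hand calculation
```

A 5x5 input with a 2x2 filter gives a 4x4 output, because the window fits in four positions along each axis.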

This is very straightforward. But let us introduce a depth factor to matrix A, i.e., an RGB image with 3 channels, or even conv layers in a deep network (with depth = 512, maybe). How would the convolution operation be done with the same filter? A similar worked example for the RGB case would be really helpful.

Aragorn asked May 08 '16 03:05




1 Answer

Let's say we have a 3-channel (RGB) image given by some matrix A

    A = [[[198 218 227]
          [196 216 225]
          [196 214 224]
          ...
          ...
          [185 201 217]
          [176 192 208]
          [162 178 194]]]

and a blur kernel K given by

    K = [[0.1111, 0.1111, 0.1111],
         [0.1111, 0.1111, 0.1111],
         [0.1111, 0.1111, 0.1111]]
    # each entry is 0.1111 ~= 1/9

The convolution can be represented as shown in the image below.

[Image: convolution of RGB channel]

As you can see in the image, each channel is convolved individually with the kernel, and the three results are then combined to form the output pixel.
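Here is a rough sketch of that per-channel idea in NumPy. All function names are hypothetical, the 4x4 test image is made-up data (not the matrix A above, whose middle rows are elided), and stride 1 with no padding is assumed. The first function follows the blur described here: the same 2D kernel is applied to R, G and B and the results are stacked back into an RGB pixel. The second is an addition beyond what this answer describes, showing the usual convention for a conv layer in a deep network: the filter has one 2D slice per input channel, and the per-channel results are summed into a single output channel. This is also how a depth-512 input would be handled.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide a 2D kernel over a 2D image (stride 1, no padding, no flip)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def blur_per_channel(rgb, kernel):
    """Apply the same 2D kernel to each channel; output keeps all channels."""
    channels = [conv2d_valid(rgb[:, :, c], kernel) for c in range(rgb.shape[2])]
    return np.stack(channels, axis=-1)


def conv_layer_single_filter(volume, kernel3d):
    """Conv-layer style: kernel3d has shape (kh, kw, depth).

    Each channel is convolved with its own kernel slice, and the
    per-channel results are summed into one 2D feature map.
    """
    depth = volume.shape[2]
    return sum(conv2d_valid(volume[:, :, c], kernel3d[:, :, c]) for c in range(depth))


# Made-up 4x4 RGB image, purely for illustration.
rgb = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)
K = np.full((3, 3), 1.0 / 9.0)          # 3x3 blur kernel

blurred = blur_per_channel(rgb, K)       # shape (2, 2, 3): still an RGB image

K3 = np.stack([K, K, K], axis=-1)        # shape (3, 3, 3): one slice per channel
feature_map = conv_layer_single_filter(rgb, K3)  # shape (2, 2): single channel

print(blurred.shape, feature_map.shape)
```

In a trained network the three slices of `K3` would generally hold different weights; they are identical here only so that the two results are easy to compare (the feature map equals the sum of the three blurred channels).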

Muthukrishnan answered Nov 22 '22 19:11
