
Tensorflow weights for kernels of convolution for colored images?

At the moment I have some networks doing classification on greyscale images. I want to move on to colored (RGB) images.

In the CIFAR-10 tutorial of Tensorflow I got confused by the weights for the convolution kernels. The first convolution there looks like this:

kernel = _variable_with_weight_decay('weights', shape=[5, 5, 3, 64],
                                         stddev=1e-4, wd=0.0)
conv = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')

So it is a 5x5 convolution with an input of 3 (one for each color channel: the red, green and blue image information) and it is generating 64 feature maps.

However, the second convolution layer takes an input of 64 feature maps:

kernel = _variable_with_weight_decay('weights', shape=[5, 5, 64, 64],
                                         stddev=1e-4, wd=0.0)
conv = tf.nn.conv2d(norm1, kernel, [1, 1, 1, 1], padding='SAME')

...so, how does this process the color information? Does this mean the different color channels are somehow spread across the 64 feature maps of convolution layer 1?

I thought conv layer 1 produces 64 feature maps for each color channel, therefore ending up with 3 * 64 = 192 feature maps...but obviously I was wrong.

How is the color information mixed there in conv layer 1?

daniel451 asked Feb 15 '16 22:02



1 Answer

See equation 3 in description of CuDNN here

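Basically, for a single example (n), a single row (p), and a single column (q), the result of the spatial convolution is a weighted sum of 5x5x3 = 75 values: a 5x5 patch taken across all 3 color channels. So each activation contains information from all 3 colors. Each of the 64 feature maps has its own 5x5x3 set of weights, which is why the first layer outputs 64 maps rather than 3 * 64. The second layer works the same way, with each activation being a weighted sum of 5x5x64 values.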
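To make the weighted sum concrete, here is a minimal dependency-free sketch of what a single activation of the first layer computes. It is not the TensorFlow or cuDNN implementation. The image size (8x8), the random values, and the helper name `activation` are assumptions made up for illustration; only the kernel layout [5, 5, 3, 64], stride 1, and 'SAME' (zero) padding come from the question.

```python
import random

random.seed(0)

K, C_IN, C_OUT = 5, 3, 64  # kernel size, input channels (RGB), output feature maps
H, W = 8, 8                # hypothetical small image size, chosen for illustration

# image[y][x][c]: one RGB image in height x width x channel layout
image = [[[random.random() for _ in range(C_IN)] for _ in range(W)]
         for _ in range(H)]

# kernel[i][j][c][k]: same layout as shape=[5, 5, 3, 64] in the question
kernel = [[[[random.gauss(0.0, 1.0) for _ in range(C_OUT)]
            for _ in range(C_IN)] for _ in range(K)] for _ in range(K)]


def activation(image, kernel, p, q, k):
    """Return (value, number_of_terms) for feature map k at row p, column q.

    Stride 1 with zero padding ('SAME'): pixels outside the image count as 0.
    """
    pad = K // 2
    total, terms = 0.0, 0
    for i in range(K):              # kernel rows
        for j in range(K):          # kernel columns
            y, x = p + i - pad, q + j - pad
            inside = 0 <= y < H and 0 <= x < W
            for c in range(C_IN):   # all color channels are summed together
                pixel = image[y][x][c] if inside else 0.0
                total += pixel * kernel[i][j][c][k]
                terms += 1
    return total, terms


value, terms = activation(image, kernel, p=3, q=4, k=0)
print(terms)  # 75, i.e. 5 * 5 * 3

# One value per (row, column, feature map): the channel axis has size 64, not 3 * 64.
feature_maps = [[[activation(image, kernel, p, q, k)[0] for k in range(C_OUT)]
                 for q in range(W)] for p in range(H)]
print(len(feature_maps[0][0]))  # 64
```

The loop over `c` sits inside the same sum as the loops over `i` and `j`, so red, green, and blue are mixed in every activation. For the second layer the same sketch would apply with `C_IN = 64`, giving 5 * 5 * 64 = 1600 terms per activation.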

Yaroslav Bulatov answered Oct 06 '22 18:10
