
Receptive Fields on ConvNets (Receptive Field size confusion)

I was reading this from a paper: "Rather than using relatively large receptive fields in the first conv. layers we use very small 3 × 3 receptive fields throughout the whole net, which are convolved with the input at every pixel (with stride 1). It is easy to see that a stack of two 3 × 3 conv. layers (without spatial pooling in between) has an effective receptive field of 5 × 5; three such layers have a 7 × 7 effective receptive field."

How do they end up with a receptive field of 7x7?

This is how I understand it: suppose that we have one image that is 100x100.

1st layer: zero-pad the image and convolve it with the 3x3 filter, output another 100x100 filtered image.

2nd layer: zero-pad the previous filtered image and convolve it with another 3x3 filter, output another 100x100 filtered image.

3rd layer: zero-pad the previous filtered image and convolve it with another 3x3 filter, output the final 100x100 filtered image.

What am I missing here?

Sprk asked May 10 '16 11:05



1 Answer

Here's one way to think of it. Consider the following small image, with each pixel numbered as such:

00 01 02 03 04 05 06
10 11 12 13 14 15 16
20 21 22 23 24 25 26
30 31 32 33 34 35 36
40 41 42 43 44 45 46
50 51 52 53 54 55 56
60 61 62 63 64 65 66

Now consider the pixel 33 at the center. With the first 3x3 convolution, the generated value at pixel 33 will incorporate the values of pixels 22, 23, 24, 32, 33, 34, 42, 43, and 44. But notice that, in that same output, each of those neighboring pixels has likewise incorporated the values of its own surrounding pixels.

With the next 3x3 convolution, pixel 33 will again incorporate the values of its surrounding pixels, but now the value of each of those pixels already incorporates its own surrounding pixels from the original image. In effect, this means that the value of pixel 33 is governed by original-image values reaching out to a 5x5 "square of influence" you could say (pixels 11 through 55 in the grid above).
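If it helps to see that traced out, here is a minimal Python sketch. The helper names `neighbors` and `influence` are made up for this illustration, and it assumes plain 3x3 convolutions with stride 1 and no pooling. It only tracks which original pixels can reach pixel 33, not the actual filter arithmetic.

```python
def neighbors(pixel):
    """Return the 3x3 neighborhood of (row, col), including the pixel itself."""
    r, c = pixel
    return {(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)}


def influence(pixel, num_layers):
    """Original-image pixels that can affect `pixel` after num_layers
    stacked 3x3 convolutions (stride 1, no pooling)."""
    region = {pixel}
    for _ in range(num_layers):
        # Each layer lets every pixel in the region pull in its own 3x3 neighbors.
        region = set().union(*(neighbors(p) for p in region))
    return region


for n in (1, 2, 3):
    region = influence((3, 3), n)
    rows = sorted({r for r, _ in region})
    print(n, "layer(s):", len(region), "pixels, rows", rows[0], "to", rows[-1])
```

With three layers the region spans rows and columns 0 to 6, i.e. the whole 7x7 grid in the example.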

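Each additional 3x3 convolution has the effect of stretching the effective receptive field by another pixel in each direction, so three layers give 3x3, then 5x5, then 7x7. Note that this is separate from the output size: with zero-padding every layer in your example still outputs a 100x100 image, but each pixel of that output depends on a larger and larger patch of the original input.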
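As a rough rule of thumb, that growth can be written as a tiny function. This is just a sketch: `receptive_field` is a name I made up here, and it assumes every layer uses the same square kernel with stride 1 and no pooling in between, as in the quoted passage.

```python
def receptive_field(num_layers, kernel_size=3):
    """Side length of the effective receptive field for a stack of
    stride-1 convolutions with no pooling in between."""
    size = 1  # a single pixel before any convolution
    for _ in range(num_layers):
        # Each layer adds (kernel_size - 1) pixels in total,
        # i.e. one pixel per side for a 3x3 kernel.
        size += kernel_size - 1
    return size


for n in (1, 2, 3):
    side = receptive_field(n)
    print(f"{n} layer(s): {side}x{side}")
# prints:
# 1 layer(s): 3x3
# 2 layer(s): 5x5
# 3 layer(s): 7x7
```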

I hope that didn't just make it more confusing...

Aenimated1 answered Oct 06 '22 01:10