
How to calculate receptive field size? [closed]

I'm reading a paper about using a CNN (Convolutional Neural Network) for object detection:

Rich feature hierarchies for accurate object detection and semantic segmentation

Here is a quote about receptive field:

The pool5 feature map is 6x6x256 = 9216 dimensional. Ignoring boundary effects, each pool5 unit has a receptive field of 195x195 pixels in the original 227x227 pixel input. A central pool5 unit has a nearly global view, while one near the edge has a smaller, clipped support. 

My questions are:

  1. What is the definition of a receptive field?
  2. How do they compute the size and location of the receptive field?
  3. How can we compute the bounding rect of the receptive field using caffe/pycaffe?
asked Feb 23 '16 by mrgloom



1 Answer

1) It is the size of the region of input pixels that can affect the output of the last convolution.

2) For each convolution and pooling operation, compute the size of the output. Then find the input size that results in a 1x1 output. That's the size of the receptive field.

3) You don't need to use a library to do it. For every 2x2 pooling, the output size is halved along each dimension. For strided convolutions, you also divide the size of each dimension by the stride. You may have to shave off some of each dimension depending on whether you use padding for your convolutions. The simplest case is padding = floor(kernel_size / 2), so that a convolution does not otherwise change the output size.
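The calculation in 2) can be sketched with the standard receptive-field recurrence: r_out = r_in + (k - 1) * j_in and j_out = j_in * s, where r is the receptive field and j is the cumulative stride ("jump"). The (kernel, stride) values below are assumed AlexNet parameters up to pool5 (not taken from the paper's prototxt), which reproduces the 195x195 figure quoted in the question:

```python
# Assumed AlexNet (name, kernel, stride) parameters up to pool5.
layers = [
    ("conv1", 11, 4),
    ("pool1", 3, 2),
    ("conv2", 5, 1),
    ("pool2", 3, 2),
    ("conv3", 3, 1),
    ("conv4", 3, 1),
    ("conv5", 3, 1),
    ("pool5", 3, 2),
]

r, j = 1, 1  # receptive field and jump of the raw input
for name, k, s in layers:
    r = r + (k - 1) * j  # each new layer widens the field by (k-1) input-space steps
    j = j * s            # strides compound multiplicatively
    print(f"{name}: receptive field = {r}x{r}")

# pool5 comes out to 195x195, matching the quote from the paper.
```

Equivalently, you can run the layers in reverse and ask what input size yields a 1x1 output, using in = (out - 1) * s + k at each layer; that backward pass also ends at 195 for these assumed parameters.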

answered Sep 29 '22 by Raff.Edward