I know this question has been asked several times before, but I didn't find much on Google beyond a few third-party packages. Is there any plan to officially include an RoI pooling layer in TensorFlow? It is a vital component for object detection and other tasks, and not having access to it is a pain when using TensorFlow.
Any comments or alternative implementations (if verified) are welcome.
Region of interest pooling (also known as RoI pooling) is an operation widely used in object detection with convolutional neural networks, for example to detect multiple cars and pedestrians in a single image.
RoI pooling removes the requirement that an object detection network receive fixed-size inputs. It produces fixed-size feature maps from regions of varying size by dividing each region into a fixed grid of bins and max-pooling within each bin. The number of output channels equals the number of input channels.
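To make the description above concrete, here is a minimal dependency-free sketch of RoI max pooling. It is an illustration, not any library's implementation: the function name `roi_max_pool`, the nested-list `[C][H][W]` layout, and the inclusive integer `(y1, x1, y2, x2)` box convention are all assumptions chosen for readability.

```python
def roi_max_pool(feature_map, roi, output_size):
    """Max-pool one region of interest to a fixed output size.

    feature_map: list of C channels, each an H x W nested list (assumed layout).
    roi:         (y1, x1, y2, x2), inclusive integer feature-map coordinates.
    output_size: (out_h, out_w), the fixed grid every RoI is reduced to.
    """
    y1, x1, y2, x2 = roi
    out_h, out_w = output_size
    roi_h = y2 - y1 + 1
    roi_w = x2 - x1 + 1

    def bin_edges(index, length, bins):
        # Floor for the start and ceil for the end, so no bin is ever empty.
        start = (index * length) // bins
        end = ((index + 1) * length + bins - 1) // bins
        return start, end

    pooled = []
    for channel in feature_map:  # channels are pooled independently
        rows = []
        for i in range(out_h):
            r0, r1 = bin_edges(i, roi_h, out_h)
            row = []
            for j in range(out_w):
                c0, c1 = bin_edges(j, roi_w, out_w)
                row.append(max(
                    channel[y1 + r][x1 + c]
                    for r in range(r0, r1)
                    for c in range(c0, c1)
                ))
            rows.append(row)
        pooled.append(rows)
    return pooled


# One channel, 4x4, holding the values 0..15 in row-major order.
fmap = [[[4 * r + c for c in range(4)] for r in range(4)]]
big = roi_max_pool(fmap, (0, 0, 3, 3), (2, 2))    # 4x4 region
small = roi_max_pool(fmap, (0, 0, 2, 1), (2, 2))  # 3x2 region
```

Both regions come out as 2x2 grids even though their sizes differ, and the output keeps the same number of channels as the input.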
Region of Interest Warping, or RoIWarp, is a form of RoIPool that is differentiable with respect to the box position. In practice, this takes the form of a RoIWarp layer followed by a standard Max Pooling layer. The RoIWarp layer crops a feature map region and warps it into a target size by interpolation.
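The two-stage structure described above (warp by interpolation, then ordinary max pooling) can be sketched as follows. This is a simplified single-channel illustration under assumed conventions: the names `bilinear_sample`, `roi_warp`, and `max_pool` are hypothetical, the box is given as continuous `(y1, x1, y2, x2)` feature-map coordinates, and sample points are spaced evenly between the box edges.

```python
import math


def bilinear_sample(fmap, y, x):
    """Bilinearly interpolate a 2D nested list at a continuous (y, x)."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)  # clamp to the valid range
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_warp(fmap, box, warp_size):
    """Crop a box and warp it to warp_size by bilinear interpolation."""
    y1, x1, y2, x2 = box
    warp_h, warp_w = warp_size

    def coords(lo, hi, n):
        # Evenly spaced sample positions from lo to hi (assumed convention).
        if n == 1:
            return [(lo + hi) / 2.0]
        return [lo + k * (hi - lo) / (n - 1) for k in range(n)]

    ys, xs = coords(y1, y2, warp_h), coords(x1, x2, warp_w)
    return [[bilinear_sample(fmap, y, x) for x in xs] for y in ys]


def max_pool(grid, k):
    """Standard non-overlapping k x k max pooling on a 2D nested list."""
    return [
        [
            max(grid[r + dr][c + dc] for dr in range(k) for dc in range(k))
            for c in range(0, len(grid[0]) - k + 1, k)
        ]
        for r in range(0, len(grid) - k + 1, k)
    ]


fmap = [[float(4 * r + c) for c in range(4)] for r in range(4)]
pooled = max_pool(roi_warp(fmap, (0.5, 0.5, 2.5, 2.5), (4, 4)), 2)
shifted = max_pool(roi_warp(fmap, (0.5, 0.6, 2.5, 2.6), (4, 4)), 2)
```

Because every sampled value is a smooth function of the box coordinates, nudging the box by a fraction of a cell changes the output by a correspondingly small amount, which is what makes the operation differentiable with respect to box position.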
I was able to find an answer to my question in the paper above. You can use the tf.image.crop_and_resize function to crop any region of a feature map and resize it. Similar to RoI pooling, you can take a bounding box, scale it down by the network's total downsampling factor (e.g. 32 in VGG16), crop it, and resize the crop to NxN (e.g. 7x7 in VGG16), which can then be fed to the fully connected layers.
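A rough sketch of that approach follows. The helper names `image_box_to_normalized` and `crop_rois` are hypothetical, as is the assumption that boxes arrive as `(x1, y1, x2, y2)` in image pixels. tf.image.crop_and_resize expects boxes as `[y1, x1, y2, x2]` normalized to `[0, 1]`, so the helper converts to that form. Note that crop_and_resize resamples by interpolation (bilinear by default) rather than max-pooling each bin, so it approximates RoI pooling rather than reproducing it exactly.

```python
def image_box_to_normalized(box, stride, feat_h, feat_w):
    """Convert an image-pixel box to the normalized form crop_and_resize expects.

    box:    (x1, y1, x2, y2) in image pixels (assumed input convention).
    stride: total downsampling factor of the network, e.g. 32.
    feat_h, feat_w: spatial size of the feature map.
    Returns [y1, x1, y2, x2], each clipped to [0, 1].
    """
    x1, y1, x2, y2 = (v / float(stride) for v in box)

    def norm(value, size):
        # A normalized coordinate of 1.0 maps to the last index, size - 1.
        return min(max(value / (size - 1), 0.0), 1.0)

    return [norm(y1, feat_h), norm(x1, feat_w), norm(y2, feat_h), norm(x2, feat_w)]


def crop_rois(feature_maps, boxes, box_indices, crop_size=(7, 7)):
    """Crop each box from feature_maps and resize it to crop_size.

    feature_maps: float tensor of shape [batch, H, W, C].
    boxes:        list of normalized [y1, x1, y2, x2] boxes.
    box_indices:  for each box, the index of the batch image it belongs to.
    Returns a tensor of shape [num_boxes, crop_h, crop_w, C].
    """
    import tensorflow as tf  # imported lazily so the helper above runs without it

    return tf.image.crop_and_resize(feature_maps, boxes, box_indices, crop_size)


# A 224x224 image with stride 32 gives a 7x7 feature map.
normalized = image_box_to_normalized((64, 32, 192, 160), stride=32, feat_h=7, feat_w=7)
```

The resulting crops all share one spatial size, so they can be flattened and passed to the fully connected layers regardless of the original box dimensions.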