
Are modern CNNs (convolutional neural networks) such as DetectNet rotation invariant?

NVIDIA's DetectNet, a CNN (convolutional neural network) for object detection, is known to be based on the approach from Yolo/DenseBox: https://devblogs.nvidia.com/parallelforall/deep-learning-object-detection-digits/

DetectNet is an extension of the popular GoogLeNet network. The extensions are similar to approaches taken in the Yolo and DenseBox papers.

And as shown here, DetectNet can detect objects (cars) at any rotation: https://devblogs.nvidia.com/parallelforall/detectnet-deep-neural-network-object-detection-digits/


Are modern CNNs (convolutional neural networks) such as DetectNet rotation invariant?

Can I train DetectNet on thousands of different images in which the object always has the same rotation angle, and then detect objects at any rotation angle?


And what about the rotation invariance of Yolo, Yolo v2, and DenseBox, on which DetectNet is based?

Alex asked Dec 03 '16 20:12



1 Answer

No

In classification problems, CNNs are not rotation invariant. You need to include images covering every possible rotation in your training set.
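A minimal sketch of why this is the case, using NumPy only. The `vertical_edge` filter is a hypothetical stand-in for a learned convolution kernel, and the toy image is made up for illustration. The same filter that fires on an upright edge gives no response once the image is rotated by 90 degrees, which is why rotated copies have to be added to the training set. Only multiples of 90 degrees are generated here; arbitrary angles would need an interpolating rotation, which is not shown.

```python
import numpy as np

def correlate_valid(image, kernel):
    """Plain 2-D cross-correlation ('valid' mode), the operation a conv layer applies."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def augment_with_rotations(image):
    """Return copies of the image rotated by 0, 90, 180 and 270 degrees."""
    return [np.rot90(image, k) for k in range(4)]

# Toy image: a vertical edge (left half dark, right half bright).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hypothetical "learned" filter that responds to vertical edges.
vertical_edge = np.array([[-1.0, 0.0, 1.0]] * 3)

# Strongest filter response on the upright and on the rotated image.
response_upright = np.abs(correlate_valid(image, vertical_edge)).max()
response_rotated = np.abs(correlate_valid(np.rot90(image), vertical_edge)).max()

# Rotated copies that could be added to the training set.
augmented = augment_with_rotations(image)
```

Here `response_upright` is non-zero while `response_rotated` is zero, because the rotated image only contains a horizontal edge that this filter cannot see.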

You can train a CNN to classify images into predefined categories (if you want to detect several objects in an image, as in your example, you need to scan every location of the image with your classifier).
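Scanning every location with a classifier could look roughly like the sketch below. `toy_classifier` is a hypothetical placeholder for a trained CNN (it just scores mean brightness), and the window size, stride and threshold are arbitrary values chosen for the toy image.

```python
import numpy as np

def sliding_windows(image, window, stride):
    """Yield (row, col, patch) for every window position in the image."""
    h, w = image.shape[:2]
    for r in range(0, h - window + 1, stride):
        for c in range(0, w - window + 1, stride):
            yield r, c, image[r:r + window, c:c + window]

def detect(image, classify, window=4, stride=2, threshold=0.9):
    """Run the classifier on every window and keep the confident ones.

    Returns a list of (row_min, col_min, row_max, col_max, score).
    """
    hits = []
    for r, c, patch in sliding_windows(image, window, stride):
        score = classify(patch)
        if score >= threshold:
            hits.append((r, c, r + window, c + window, score))
    return hits

def toy_classifier(patch):
    """Hypothetical stand-in for a trained CNN: mean brightness as the score."""
    return float(patch.mean())

# Toy image with one bright 4x4 "object".
image = np.zeros((8, 8))
image[2:6, 4:8] = 1.0

num_windows = sum(1 for _ in sliding_windows(image, 4, 2))
boxes = detect(image, toy_classifier)
```

Even on this 8x8 image the classifier is called once per window, and real images also need several window sizes, so the number of classifier calls grows quickly.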

However, this is an object detection problem, not just a classification problem.

In object detection problems, you can use a sliding window approach, but it is extremely inefficient. Instead of a simple CNN, other architectures are the state of the art. For example:

  • Faster RCNN: https://arxiv.org/pdf/1506.01497.pdf
  • YOLO NET: https://pjreddie.com/darknet/yolo/
  • SSD: https://arxiv.org/pdf/1512.02325.pdf

These architectures can detect the object anywhere in the image, but you must still include samples with different rotations in the training set (and the training set must be labelled with bounding boxes, which is very time consuming).
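One way to reduce the labelling effort is to rotate already labelled images and remap their bounding boxes instead of annotating new ones. The sketch below handles only a 90 degree counterclockwise rotation and assumes boxes are stored as `(x_min, y_min, x_max, y_max)` in pixel coordinates with an exclusive maximum; that format is an assumption for illustration, not the label format of any particular framework.

```python
import numpy as np

def rotate90_with_boxes(image, boxes):
    """Rotate an image 90 degrees counterclockwise and remap its boxes.

    boxes: list of (x_min, y_min, x_max, y_max), max edge exclusive (assumed format).
    A point (x, y) in the original lands at (y, width - x) after the rotation.
    """
    width = image.shape[1]
    rotated = np.rot90(image)
    new_boxes = [(y0, width - x1, y1, width - x0) for (x0, y0, x1, y1) in boxes]
    return rotated, new_boxes

# Toy example: a 4x6 image with one labelled 2x2 object.
image = np.zeros((4, 6))
image[0:2, 1:3] = 1.0
boxes = [(1, 0, 3, 2)]

rotated, new_boxes = rotate90_with_boxes(image, boxes)
```

For arbitrary angles the rotated box is no longer axis aligned, so the usual approach is to take the axis-aligned box enclosing the rotated corners, which makes the labels slightly looser.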

Rob answered Oct 03 '22 22:10
