 

Why are rotation-invariant neural networks not used by the winners of the popular competitions?

As is known, the most popular modern CNNs (convolutional neural networks), such as VGG/ResNet (Faster R-CNN), SSD, YOLO, YOLO v2, DenseBox and DetectNet, are not rotation invariant (see: Are modern CNN (convolutional neural network) as DetectNet rotate invariant?).
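For illustration, here is a minimal sketch of how one could check this, assuming PyTorch and a recent torchvision; the network, input and rotation angle are arbitrary choices of mine, not taken from any of the papers below:

```python
import torch
import torchvision.transforms.functional as TF
from torchvision import models

# Any standard CNN will do; random weights are enough to show the effect.
model = models.resnet18(weights=None).eval()

x = torch.rand(1, 3, 224, 224)      # stand-in for a preprocessed image
x_rot = TF.rotate(x, angle=45.0)    # the same image rotated by 45 degrees

with torch.no_grad():
    out = model(x)
    out_rot = model(x_rot)

# The outputs generally differ, i.e. the network is not rotation invariant.
print((out - out_rot).abs().max().item())
```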

It is also known that there are several neural networks for rotation-invariant object detection (a rough sketch of one of the ideas used in such papers, pooling a filter's responses over its rotated copies, follows the list):

  1. Rotation-Invariant Neoperceptron 2006 (PDF): https://www.researchgate.net/publication/224649475_Rotation-Invariant_Neoperceptron

  2. Learning rotation invariant convolutional filters for texture classification 2016 (PDF): https://arxiv.org/abs/1604.06720

  3. RIFD-CNN: Rotation-Invariant and Fisher Discriminative Convolutional Neural Networks for Object Detection 2016 (PDF): http://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Cheng_RIFD-CNN_Rotation-Invariant_and_CVPR_2016_paper.html

  4. Encoded Invariance in Convolutional Neural Networks 2014 (PDF)

  5. Rotation-invariant convolutional neural networks for galaxy morphology prediction (PDF): https://arxiv.org/abs/1503.07077

  6. Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images 2016: http://ieeexplore.ieee.org/document/7560644/
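For a sense of how such networks work, here is a rough sketch of the orientation-pooling idea mentioned above; this is my own simplification, not the exact method of any of the papers listed, and it assumes PyTorch with rotations restricted to multiples of 90 degrees:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OrientationPooledConv(nn.Module):
    """Convolution whose response is max-pooled over 4 rotated copies of the filter."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)

    def forward(self, x):
        responses = []
        for r in range(4):                               # 0, 90, 180, 270 degrees
            w = torch.rot90(self.weight, r, dims=(2, 3))
            responses.append(F.conv2d(x, w, padding=self.weight.shape[-1] // 2))
        # Max over orientations: the strongest response survives no matter which
        # rotated copy matched, giving an (approximately) rotation-invariant output.
        return torch.stack(responses, dim=0).max(dim=0).values
```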

We know that in image-detection competitions such as ImageNet, MS COCO and PASCAL VOC, ensembles of networks are used (several neural networks at once), or ensembles within a single network, such as ResNet (Residual Networks Behave Like Ensembles of Relatively Shallow Networks).
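As a minimal sketch of what such an ensemble amounts to at prediction time (assuming PyTorch; the individual classifiers are hypothetical, already-trained models):

```python
import torch

def ensemble_predict(models, x):
    """Average the class probabilities of several independently trained classifiers."""
    with torch.no_grad():
        probs = torch.stack([m(x).softmax(dim=1) for m in models], dim=0)
    return probs.mean(dim=0)                    # shape: (batch, num_classes)

# usage (net_a, net_b, net_c are hypothetical trained models):
# labels = ensemble_predict([net_a, net_b, net_c], batch).argmax(dim=1)
```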

But do winners such as MSRA use rotation-invariant networks in their ensembles, and if not, why not? Why does adding a rotation-invariant network to an ensemble not improve accuracy for certain objects, such as aircraft, whose images are taken at different angles of rotation?

These can be:

  • aircraft photographed from the ground

  • or ground objects photographed from the air

Why are rotation-invariant neural networks not used by the winners of the popular object-detection competitions?

asked Dec 09 '16 by Alex



2 Answers

The recent progress in image recognition, which was made mainly by moving from the classic approach (feature selection plus a shallow learning algorithm) to the new one (no feature selection plus a deep learning algorithm), wasn't caused only by the mathematical properties of convolutional neural networks. Yes, of course their ability to capture the same information with a smaller number of parameters is partly due to their shift-invariance property, but recent research has shown that this is not the key to understanding their success.

In my opinion, the main reason behind this success was the development of faster learning algorithms rather than more mathematically accurate ones, and that is why less attention is paid to developing neural nets with further invariance properties.

Of course, rotation invariance is not skipped entirely. It is partly handled by data augmentation, where you add a slightly changed (e.g. rotated or rescaled) copy of an image to your dataset, with the same label. As we can read in this fantastic book, these two approaches (more structure vs. less structure plus data augmentation) are more or less equivalent (Chapter 5.5.3, titled "Invariances").
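As a minimal sketch of that augmentation approach, assuming torchvision; the specific transforms and parameters are arbitrary choices of mine, not taken from the book:

```python
from torchvision import transforms

# Rotated/rescaled copies of each image enter training with the unchanged label,
# so the network learns an approximate rotation invariance from the data itself.
train_transform = transforms.Compose([
    transforms.RandomRotation(degrees=180),               # random rotation
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),  # slight rescaling
    transforms.ToTensor(),
])

# e.g. dataset = torchvision.datasets.ImageFolder("data/train", transform=train_transform)
```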

answered Oct 03 '22 by Marcin Możejko


Like @Alex, I am also wondering why the community and scholars have not paid much attention to rotation-invariant CNNs.

One possible cause, in my opinion, is that many scenarios do not need this property, especially those popular competitions. As Rob mentioned, many natural pictures are already taken in a consistent horizontal (or vertical) orientation. For example, in face detection, many works align the picture so that the people are upright before feeding it to any CNN model. To be honest, this is the cheapest and most efficient approach for this particular task.

However, there do exist real-life scenarios that need the rotation-invariance property. So I come to another guess: this problem is simply not difficult from those experts' (or researchers') point of view. At a minimum, we can use data augmentation to obtain some rotation invariance.
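A closely related trick, added here as my own illustration rather than something from the thread, is test-time augmentation: average the predictions over several rotated copies of the input, which buys some rotation invariance without retraining (a minimal sketch assuming PyTorch/torchvision):

```python
import torch
import torchvision.transforms.functional as TF

def rotation_averaged_predict(model, x, angles=(0.0, 90.0, 180.0, 270.0)):
    """Average class probabilities over rotated copies of the input batch."""
    with torch.no_grad():
        probs = [model(TF.rotate(x, a)).softmax(dim=1) for a in angles]
    return torch.stack(probs, dim=0).mean(dim=0)
```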

Lastly, thanks so much for your summary of the papers. I would add one more paper, Group Equivariant Convolutional Networks (ICML 2016, G-CNN), and its GitHub implementation by other people.

answered Oct 03 '22 by Tan Cniao