Why are inputs for convolutional neural networks always square images?

Question

I have been doing deep learning with CNNs for a while, and I have noticed that the inputs to a model are always square images.

As far as I can see, neither the convolution operation nor the neural network architecture itself requires this property.

So, what is the reason for that?

Yaroslav Bulatov · Accepted Answer

Because square images are pleasing to the eye. But there are applications that use non-square images when the domain requires it. For instance, an image in the original SVHN dataset contains several digits, and hence rectangular images are used as input to the convnet.

T Nguyen · Answer

From Suhas Pillai:

The problem is not with the convolutional layers; it is with the fully connected layers of the network, which require a fixed number of input neurons. For example, take a small 3-layer network plus a softmax layer. Suppose the first 2 layers are convolution + max pooling, the convolutions preserve the spatial dimensions, and each pooling step halves them, which is usually the case. For an image of 3*32*32 (C, W, H) with 4 filters in the first layer and 6 filters in the second, the output after the second convolution + max pooling layer is 6*8*8, whereas for a 3*64*64 image it is 6*16*16.

Before the fully connected layer, we flatten this output into a single vector (6*8*8 = 384 neurons for the 32*32 image, but 6*16*16 = 1536 for the 64*64 one) and apply the fully connected operation. A fully connected layer has a fixed number of inputs, so you cannot use the same layer for images of different sizes.

One way to tackle this is spatial pyramid pooling, which pools the output of the last convolutional layer into a fixed number of bins (i.e. neurons), so that the fully connected layer always receives the same number of inputs. You can also look at fully convolutional networks, which can take non-square images.
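The arithmetic above can be checked with a short sketch. It is plain Python with no deep learning library, and the function names (flattened_size, spp_max) and the pyramid levels (1, 2, 4) are assumptions chosen for illustration, not part of the original answer. It assumes size-preserving convolutions and 2x2 pooling, as in the example, and shows that the flattened size changes with the input size while a simple spatial pyramid max pooling always yields the same number of values per feature map.

```python
def flattened_size(height, width, filters=(4, 6)):
    """Values fed to the fully connected layer after len(filters) blocks of
    size-preserving convolution followed by 2x2 max pooling."""
    channels = 3  # RGB input; replaced by the filter count of each layer
    for f in filters:
        channels = f          # convolution changes only the channel count
        height //= 2          # pooling halves each spatial dimension
        width //= 2
    return channels * height * width


def spp_max(feature_map, levels=(1, 2, 4)):
    """Max-pool one 2D feature map (list of rows) into n x n bins for each
    n in levels, returning a flat list of sum(n * n) values."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            r0 = (i * h) // n                  # floor of bin start
            r1 = ((i + 1) * h + n - 1) // n    # ceil of bin end
            for j in range(n):
                c0 = (j * w) // n
                c1 = ((j + 1) * w + n - 1) // n
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


if __name__ == "__main__":
    print(flattened_size(32, 32))   # 6 * 8 * 8
    print(flattened_size(64, 64))   # 6 * 16 * 16
    print(flattened_size(32, 64))   # non-square input, 6 * 8 * 16

    small = [[r * 8 + c for c in range(8)] for r in range(8)]
    large = [[r * 12 + c for c in range(12)] for r in range(16)]
    print(len(spp_max(small)), len(spp_max(large)))  # same length for both
```

With levels (1, 2, 4) each feature map contributes 1 + 4 + 16 = 21 values regardless of its height and width, so 6 feature maps would always give the fully connected layer 126 inputs.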

Martin Thoma · Answer

It is not necessary to have square images. I see two "reasons" for it:

scaling: If images are rescaled automatically from other aspect ratios (both landscape and portrait), a square target might introduce the least error on average
publications / visualizations: square images are easy to display together

Tags:

artificial-intelligence

neural-network

deep-learning
