
Difference between 3D-tensor and 4D-tensor for images input of DL Keras framework

By convention, a single image is stored as a 3D tensor: one dimension for its height, one for its width, and a third for its color channels. Its shape looks like (height, width, color).

For instance, a batch of 128 color images of size 256x256 could be stored in a 4D tensor of shape (128, 256, 256, 3), where the last axis holds the RGB channels. Likewise, a batch of 128 grayscale images could be stored in a 4D tensor of shape (128, 256, 256, 1). The color values could be coded as 8-bit integers.

For the second example, the last dimension holds only one element, so it is possible to use a 3D tensor of shape (128, 256, 256) instead.
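To make the two layouts concrete, here is a minimal NumPy sketch. The all-zero batch is a hypothetical stand-in for real image data; it only shows that both shapes describe the same values.

```python
import numpy as np

# Hypothetical batch: 128 grayscale images of 256x256, stored as 8-bit integers.
batch_4d = np.zeros((128, 256, 256, 1), dtype=np.uint8)

# Dropping the single-element channel axis gives the 3D layout.
batch_3d = np.squeeze(batch_4d, axis=-1)

print(batch_4d.shape)  # (128, 256, 256, 1)
print(batch_3d.shape)  # (128, 256, 256)

# Same number of values and same memory footprint; only the shape differs.
same_size = batch_4d.size == batch_3d.size
same_bytes = batch_4d.nbytes == batch_3d.nbytes
```

So as far as storage goes, the two tensors are equivalent; the open question is only which shape the layer accepts.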

Here is my question: is there a difference between using a 3D tensor rather than a 4D tensor as the training input of a deep-learning model in Keras?

EDIT: My input layer is a Conv2D.

kabhel asked Mar 15 '19 12:03



1 Answer

If you take a look at the Keras documentation of the Conv2D layer, you will see that the input tensor must be 4D.

Conv2D layer input shape:
4D tensor with shape: (batch, channels, rows, cols) if data_format is "channels_first" or 4D tensor with shape: (batch, rows, cols, channels) if data_format is "channels_last".

So the channel dimension is mandatory, even if its size is only 1, as for a grayscale image.
In fact, it is not a matter of performance gain or simplicity; it is simply the input shape the layer requires.
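If your data is already stored as a 3D tensor, adding the missing channel axis is cheap. Here is a minimal NumPy sketch, assuming a hypothetical all-zero batch `x` of grayscale images; which axis you add depends on the data_format you use.

```python
import numpy as np

# Hypothetical 3D batch of grayscale images: (batch, rows, cols).
x = np.zeros((128, 256, 256), dtype=np.uint8)

# channels_last: append the channel axis -> (batch, rows, cols, channels)
x_channels_last = np.expand_dims(x, axis=-1)

# channels_first: insert the channel axis right after the batch axis
# -> (batch, channels, rows, cols)
x_channels_first = np.expand_dims(x, axis=1)

print(x_channels_last.shape)   # (128, 256, 256, 1)
print(x_channels_first.shape)  # (128, 1, 256, 256)
```

Either result has the four dimensions described in the documentation quoted above.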
Hope it answers your question.

Gabriel Cretin answered Sep 30 '22 16:09