Why do some images have third dimension 3 while others have 4?

I don't have much knowledge of image processing. I am trying to implement a ConvNet. I downloaded some images as a data set and gave them all the same height and width. Then I tried loading them into a NumPy array with this code:

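import glob
import numpy as np
from PIL import Image
# Raw string so the backslashes in the Windows path are not read as escapes
train_list = glob.glob(r'A:\Code\Machine Learning\CNN\ConvolutionalNN1\TrainImg\*.jpg')
X_train_orig = np.array([np.array(Image.open(file)) for file in train_list])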

But it gave me an error saying it cannot broadcast (420,310) to (420,310,3). Then I printed the shapes of the arrays: some were (420,310,3), others (410,320,4). Why is that? And how can I change them so they fit into one array?

Gameatro asked Aug 20 '18 02:08



1 Answer

Problem

You are dealing with three different image formats (at least, those that appear in your question):

  • RGB (of dimension (420, 310, 3)), three channels
  • RGB-A (of dimension (420, 310, 4)), four channels
  • Grayscale (of dimension (420, 310)), single channel

The third dimension is the number of channels in the image; the first two are the height and width, respectively.

An example will make this clearer. I downloaded three random images from the internet, one for each of the formats mentioned above.
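As a minimal sketch of the point about channels, the snippet below builds three blank in-memory images with Pillow's Image.new, one per mode, and prints the shape NumPy reports for each. The 310x420 size is borrowed from the question and the images are empty stand-ins, not the asker's files.

```python
from PIL import Image
import numpy as np

# Pillow takes size as (width, height); NumPy reports (height, width[, channels]).
size = (310, 420)

shapes = {}
for mode in ('RGB', 'RGBA', 'L'):
    img = Image.new(mode, size)  # blank image; the pixel values do not matter here
    shapes[mode] = np.array(img).shape
    print(mode, shapes[mode])
```

Note that the grayscale mode ('L') has no channel axis at all, which is why its array is two-dimensional.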

RGB image dog.png

RGB-A image fish.png

Grayscale image lena.png

Here's a Python script that loads each of them with PIL and prints its shape:

from PIL import Image
import numpy as np

dog = Image.open('dog.png')
print('Dog shape is ' + str(np.array(dog).shape))

fish = Image.open('fish.png')
print('Fish shape is ' + str(np.array(fish).shape))

lena = Image.open('lena.png')
print('Lena shape is ' + str(np.array(lena).shape))

And here is the output:

Dog shape is (250, 250, 3)
Fish shape is (501, 393, 4)
Lena shape is (512, 512)

Hence, when you try to pack all the images into a single array with np.array, you get the shape mismatch error: arrays with different numbers of dimensions or channels cannot be combined into one regular array.
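One way to spot this before building the array is to tally the shapes in the data set; more than one distinct shape means the images will not fit into a single regular array. The sketch below assumes a hypothetical helper, count_shapes, and uses zero-filled arrays as stand-ins for loaded images.

```python
from collections import Counter
import numpy as np

def count_shapes(arrays):
    """Return a Counter mapping each array shape to how often it occurs."""
    return Counter(np.asarray(a).shape for a in arrays)

# Stand-ins for loaded images: two RGB, one RGB-A, one grayscale.
fake_images = [
    np.zeros((420, 310, 3), dtype=np.uint8),
    np.zeros((420, 310, 3), dtype=np.uint8),
    np.zeros((420, 310, 4), dtype=np.uint8),
    np.zeros((420, 310), dtype=np.uint8),
]

shape_counts = count_shapes(fake_images)
for shape, n in shape_counts.items():
    print(shape, n)
```

With real files, the same helper could be fed np.array(Image.open(path)) for each path.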

Solution

The easiest way to resolve this is to convert all the images to one format before storing them in the array. Assuming you will be using a pre-trained ImageNet model, we will convert them to RGB (you can choose a different format if you prefer).

We will convert RGB-A to RGB using the following code:

fish = Image.open('fish.png')
print('Fish RGB-A shape is ' + str(np.array(fish).shape))
rgb = fish.convert('RGB')
print('Fish RGB shape is ' + str(np.array(rgb).shape))

Output is:

Fish RGB-A shape is (501, 393, 4)
Fish RGB shape is (501, 393, 3)

You can do the same for all your images, which gives a consistent number of channels (three in this case) across the data set. The same convert('RGB') call also turns a grayscale image into a three-channel one.
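Applied to a whole folder, the conversion might look like the sketch below. load_rgb_batch is a hypothetical helper, and the demo writes throwaway PNG files into a temporary directory so it runs without the asker's data; in practice you would pass your own list of paths from glob.glob.

```python
import glob
import os
import tempfile

import numpy as np
from PIL import Image

def load_rgb_batch(paths):
    """Open each file, force it to 3-channel RGB, and stack into one array."""
    arrays = []
    for path in paths:
        with Image.open(path) as img:
            arrays.append(np.array(img.convert('RGB')))
    # np.stack requires every array to have the same shape.
    return np.stack(arrays)

# Demo on throwaway files: one RGB, one RGB-A, one grayscale, all 310x420.
with tempfile.TemporaryDirectory() as tmp:
    for name, mode in [('a', 'RGB'), ('b', 'RGBA'), ('c', 'L')]:
        Image.new(mode, (310, 420)).save(os.path.join(tmp, name + '.png'))
    paths = sorted(glob.glob(os.path.join(tmp, '*.png')))
    batch = load_rgb_batch(paths)

print(batch.shape)  # (3, 420, 310, 3)
```

The result has shape (number of images, height, width, 3), which is the layout most ConvNet libraries expect for a batch of RGB images.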

NOTE: In my example, the spatial dimensions of the images also vary. In your case that is not an issue, since all your images have the same dimensions (420, 310).
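If the spatial dimensions did vary, one option would be to resize each image while converting it. This is only a sketch: to_rgb_array is a hypothetical helper, the target size is taken from the question, and the blank images stand in for the three example files. Note that a plain resize does not preserve the aspect ratio.

```python
from PIL import Image
import numpy as np

def to_rgb_array(img, size):
    """Convert a Pillow image to RGB and resize it to size=(width, height)."""
    return np.array(img.convert('RGB').resize(size))

target = (310, 420)  # (width, height), chosen to match the question

# Blank stand-ins with the same sizes and modes as dog, fish and lena.
images = [
    Image.new('RGB', (250, 250)),
    Image.new('RGBA', (393, 501)),
    Image.new('L', (512, 512)),
]

resized = np.stack([to_rgb_array(img, target) for img in images])
print(resized.shape)  # (3, 420, 310, 3)
```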

Hope this clarifies your doubt.

Koustav answered Nov 10 '22 21:11