I don't have much knowledge of image processing. I am trying to implement a ConvNet. I downloaded some images as a data set and made their heights and widths equal. Then I tried loading them into a np.array with this code:
train_list = glob.glob(r'A:\Code\Machine Learning\CNN\ConvolutionalNN1\TrainImg\*.jpg')
X_train_orig = np.array([np.array(Image.open(file)) for file in train_list])
But it gave me an error: cannot broadcast (420,310) into (420,310,3). Then I printed the shapes of the arrays; some were (420,310,3), others (410,320,4). Why is that? And how can I change them so they fit into one array?
What is happening here is that you are dealing with three different image formats (at least among those that appear in your question). They are, respectively:

(420, 310, 3): three channels
(420, 310, 4): four channels
(420, 310): single channel

The third dimension that you are seeing represents the number of channels in your image (the first two being the height and width, respectively).
An example will clear this up further. I downloaded random images from the internet, each belonging to one of the three formats mentioned above:
RGB image dog.png
RGB-A image fish.png
Grayscale image lena.png
Here's a Python script that loads each of them using PIL and displays its shape:
from PIL import Image
import numpy as np

# RGB image: three channels
dog = Image.open('dog.png')
print('Dog shape is ' + str(np.array(dog).shape))

# RGB-A image: four channels
fish = Image.open('fish.png')
print('Fish shape is ' + str(np.array(fish).shape))

# Grayscale image: single channel
lena = Image.open('lena.png')
print('Lena shape is ' + str(np.array(lena).shape))
And here is the output:
Dog shape is (250, 250, 3)
Fish shape is (501, 393, 4)
Lena shape is (512, 512)
Hence, when you try to iteratively assign all the images to a single array (np.array), you get the shape-mismatch error.
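For intuition, here is a minimal sketch (with placeholder zero arrays standing in for your images) of how that mismatch surfaces; the exact exception text depends on your NumPy version:

import numpy as np

rgb_img = np.zeros((420, 310, 3))   # three-channel image
gray_img = np.zeros((420, 310))     # single-channel image

# Stacking arrays of different shapes in one call fails: older NumPy
# raises the broadcast ValueError from the question, newer NumPy
# complains about an "inhomogeneous shape".
X = np.array([rgb_img, gray_img])   # raises ValueError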
The easiest way to resolve this is to convert all the images to one particular format before saving them in the array. Assuming you will be using a pre-trained ImageNet model, we will convert them to RGB format (you can similarly choose another format of your choice). We will convert RGB-A to RGB using the following code:
fish = Image.open('fish.png')
print('Fish RGB-A shape is ' + str(np.array(fish).shape))
rgb = fish.convert('RGB')
print('Fish RGB shape is ' + str(np.array(rgb).shape))
Output is:
Fish RGB-A shape is (501, 393, 4)
Fish RGB shape is (501, 393, 3)
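The same convert('RGB') call also handles the single-channel case, which is what your (420, 310) images are: PIL replicates the gray values across three channels. Using the lena.png example from above:

lena = Image.open('lena.png')
print('Lena grayscale shape is ' + str(np.array(lena).shape))

lena_rgb = lena.convert('RGB')
print('Lena RGB shape is ' + str(np.array(lena_rgb).shape))

which prints (512, 512) and then (512, 512, 3).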
You can do the same for all your images, and then you will have a consistent number of channels (three in this case) across all of them.
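Putting it together with the loading code from your question, a minimal sketch (assuming the same train_list glob) would be:

import glob
import numpy as np
from PIL import Image

train_list = glob.glob(r'A:\Code\Machine Learning\CNN\ConvolutionalNN1\TrainImg\*.jpg')

# Force every image to RGB while loading, so each element has shape
# (height, width, 3) and stacking into one array succeeds.
X_train_orig = np.array([np.array(Image.open(file).convert('RGB'))
                         for file in train_list])

print(X_train_orig.shape)   # (num_images, 420, 310, 3) for your data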
NOTE: In my example, the spatial dimensions also vary between the images. In your case that is not an issue, as all of yours have the consistent dimensions (420, 310).
Hope this clarifies your doubt.