
ImageDataGenerator: how to add the 4th dimension to a numpy array?

I have the following code that reads an image with OpenCV and displays it:

import cv2, matplotlib.pyplot as plt
img = cv2.imread('imgs_soccer/soccer_10.jpg',cv2.IMREAD_COLOR)
img = cv2.resize(img, (128, 128))
plt.imshow(img)
plt.show()

I want to generate some random images using Keras, so I defined this generator:

image_gen = ImageDataGenerator(rotation_range=15,
                           width_shift_range=0.1,
                           height_shift_range=0.1,
                           shear_range=0.01,
                           zoom_range=[0.9, 1.25],
                           horizontal_flip=True,
                           vertical_flip=False,
                           fill_mode='reflect',
                           data_format='channels_last',
                           brightness_range=[0.5, 1.5])

But when I use it this way:

image_gen.flow(img)

I get this error:

'Input data in `NumpyArrayIterator` should have rank 4. You passed an array with shape', (128, 128, 3))

That seems obvious to me: an RGB image is, of course, 3-dimensional! What am I missing here? The documentation says it wants a 4-dimensional array, but does not specify what I should put in the 4th dimension!

And how should this 4-dimensional array be built? For now I have (width, height, channel); does the 4th dimension go at the start or at the end?

I am also not very familiar with numpy: how can I alter the existing img array to add a 4th dimension?

Phate asked Apr 24 '19 16:04


2 Answers

Use np.expand_dims():

import numpy as np
img = np.expand_dims(img, 0)
print(img.shape) # (1, 128, 128, 3)

The first dimension specifies the number of images (in your case 1 image).
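
As a minimal sketch of the same idea, the snippet below uses a zero-filled placeholder array instead of the real image (the placeholder and the name `batch` are assumptions for illustration, not part of the original code). It shows that the new axis goes at the start and that the image data itself is unchanged:

```python
import numpy as np

# Placeholder standing in for the resized image from the question
# (assumption: 128x128 pixels, 3 colour channels, 8-bit values).
img = np.zeros((128, 128, 3), dtype=np.uint8)

# Insert a new axis of length 1 at position 0: the batch axis.
batch = np.expand_dims(img, axis=0)

print(img.shape)    # (128, 128, 3)
print(batch.shape)  # (1, 128, 128, 3)

# Indexing the first axis gives back the original image unchanged.
assert np.array_equal(batch[0], img)
```

An array shaped like `batch` has the rank 4 that the error message from `image_gen.flow(...)` asks for.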

Vlad answered Oct 10 '22 20:10


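Alternatively, you can use numpy.newaxis (or None, which it is an alias for) to promote your 3D array to 4D. Use one of the two forms, not both:

import numpy as np
img = img[np.newaxis, ...]

# or, equivalently, use None instead of the line above
img = img[None, ...]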

The first dimension is usually the batch size. This gives you a lot of flexibility when you want to fully utilize modern hardware such as GPUs, as long as the tensor fits in GPU memory. For example, you can pass 64 images at once by stacking them along the first dimension. In this case, your 4D array would have shape (64, height, width, channels), given data_format='channels_last'.
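
The stacking described above can be sketched with plain NumPy. The 64 placeholder images and the names `images`, `batch` and `single` are assumptions for illustration; real images would need to share the same shape before stacking:

```python
import numpy as np

# 64 placeholder images (assumption: all share the shape 128x128x3).
# Image i is filled with the value i so the images can be told apart.
images = [np.full((128, 128, 3), i, dtype=np.uint8) for i in range(64)]

# np.stack joins equally shaped arrays along a new first axis.
batch = np.stack(images, axis=0)
print(batch.shape)   # (64, 128, 128, 3)

# A single image promoted with np.newaxis has the same layout, batch size 1.
single = images[0][np.newaxis, ...]
print(single.shape)  # (1, 128, 128, 3)
```

Either array has rank 4, so both forms match the input layout the generator's flow method expects.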

kmario23 answered Oct 10 '22 21:10
