I have the following code that reads an image with opencv and displays it:
import cv2, matplotlib.pyplot as plt
img = cv2.imread('imgs_soccer/soccer_10.jpg',cv2.IMREAD_COLOR)
img = cv2.resize(img, (128, 128))
plt.imshow(img)
plt.show()
I want to generate some randomly augmented images with Keras, so I defined this generator:
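# Import path assumed for standalone Keras 2.x; with TensorFlow 2.x it is tensorflow.keras.preprocessing.image
from keras.preprocessing.image import ImageDataGenerator
image_gen = ImageDataGenerator(rotation_range=15,
                               width_shift_range=0.1,
                               height_shift_range=0.1,
                               shear_range=0.01,
                               zoom_range=[0.9, 1.25],
                               horizontal_flip=True,
                               vertical_flip=False,
                               fill_mode='reflect',
                               data_format='channels_last',
                               brightness_range=[0.5, 1.5])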
But when I use it like this:
image_gen.flow(img)
I get this error:
ValueError: ('Input data in `NumpyArrayIterator` should have rank 4. You passed an array with shape', (128, 128, 3))
That seems obvious to me: an RGB image is, of course, 3-dimensional! What am I missing here? The documentation says it wants a 4-dimensional array, but it does not say what I should put in the 4th dimension.
And how should this 4-dimensional array be built? For now I have (width, height, channel). Does the 4th dimension go at the start or at the end?
I am also not very familiar with numpy: how can I alter the existing img array to add a 4th dimension?
Use np.expand_dims():
import numpy as np
img = np.expand_dims(img, 0)
print(img.shape) # (1, 128, 128, 3)
The first dimension specifies the number of images (in your case 1 image).
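On the question of whether the new dimension goes at the start or at the end: a minimal sketch, assuming the usual (samples, height, width, channels) layout that goes with data_format='channels_last'. The zero-filled array is only a stand-in for the resized image from the question.

```python
import numpy as np

# Stand-in for the resized image: (height, width, channels).
img = np.zeros((128, 128, 3), dtype=np.uint8)

# The batch axis goes first, so insert the new axis at position 0.
batch = np.expand_dims(img, axis=0)
print(batch.shape)  # (1, 128, 128, 3)

# Appending the axis at the end also gives rank 4, but under the
# (samples, height, width, channels) convention it would describe
# 128 samples of shape (128, 3, 1), which is not the intent here.
wrong = np.expand_dims(img, axis=-1)
print(wrong.shape)  # (128, 128, 3, 1)
```

Both results have rank 4, so the rank check alone would not catch the second form; the axis position is what matters.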
Alternatively, you can use numpy.newaxis or None to promote your 3D array to 4D, as in:
img = img[np.newaxis, ...]
# or use None
img = img[None, ...]
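As a quick check that these spellings are interchangeable, here is a small sketch on a tiny made-up array (the 2x2x3 shape is just for illustration, not the real image):

```python
import numpy as np

# Tiny stand-in image with distinct values so equality is meaningful.
img = np.arange(2 * 2 * 3).reshape(2, 2, 3)

a = np.expand_dims(img, 0)
b = img[np.newaxis, ...]
c = img[None, ...]

print(np.newaxis is None)                             # True
print(a.shape, b.shape, c.shape)                      # (1, 2, 2, 3) (1, 2, 2, 3) (1, 2, 2, 3)
print(np.array_equal(a, b) and np.array_equal(b, c))  # True
```

np.newaxis is simply an alias for None, so which form to use is a matter of readability.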
The first dimension is usually the batch_size. This gives you a lot of flexibility when you want to fully utilize modern hardware such as GPUs, as long as your tensor fits in GPU memory. For example, you can pass 64 images at once by stacking them along the first dimension. In this case, your 4D array would have shape (64, height, width, channels). Note that OpenCV returns arrays as (height, width, channels), so height comes before width.
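Stacking several images into one batch could look roughly like this. The random arrays are placeholders for 64 images that have already been resized to the same shape; np.stack raises a ValueError if the shapes differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data: 64 images, each (height, width, channels) = (128, 128, 3).
images = [
    rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8)
    for _ in range(64)
]

# Stack along a new first axis to form the batch dimension.
batch = np.stack(images, axis=0)
print(batch.shape)  # (64, 128, 128, 3)
```

The resulting array has rank 4, with the number of images in the first dimension.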