
Keras ImageDataGenerator() how to get all labels from data

Tags: python, keras

I am using the ImageDataGenerator() in Keras and I would like to get the labels of my entire test data.

Currently I am using the following code to accomplish this task:

test_batches = ImageDataGenerator().flow_from_directory(...)

test_labels = []

for i in range(0,3):
    test_labels.extend(np.array(test_batches[i][1]))

This code, however, only works because I know that I have 150 images in total and that my batch size is 50.

Moreover using:

imgs, labels = next(test_batches)

as suggested in similar posts on this topic, only returns the labels for one batch, not for the entire dataset. Is there a more efficient way of doing this than the method I am using above?

AaronDT asked Jan 22 '18 01:01



3 Answers

If you know the batch_size, you can get the number of images from the object returned by flow_from_directory and work out how many batches you need to read:

import math
import numpy as np

test_batches = ImageDataGenerator().flow_from_directory(.., batch_size=n)
number_of_examples = len(test_batches.filenames)
# Round up so that a final, smaller batch is still read.
# The 1.0 forces float division on Python 2; on Python 3 it is harmless.
number_of_generator_calls = math.ceil(number_of_examples / (1.0 * n))

test_labels = []

for i in range(int(number_of_generator_calls)):
    test_labels.extend(np.array(test_batches[i][1]))
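
The batch count above is a ceiling division. Here is a minimal sketch of that arithmetic alone, using the question's figures (150 images, batch size 50) plus an uneven case. The helper name `number_of_batches` is made up for illustration and is not part of Keras; the sketch assumes Python 3, where `/` is true division.

```python
import math


def number_of_batches(number_of_examples, batch_size):
    """Batches needed to cover every example; the last batch may be partial."""
    return math.ceil(number_of_examples / batch_size)


print(number_of_batches(150, 50))  # 3
print(number_of_batches(151, 50))  # 4
```

If the generator supports len(), as another answer here relies on, len(test_batches) should give the same count without the manual calculation.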

Marcin Możejko answered Oct 17 '22 21:10


If you just want the labels, you can use the labels attribute directly:

test_batches.labels

If you also want the image data itself, you can collect it batch by batch:

validation_x = []

for i in range(len(test_batches)):
    validation_x.extend(test_batches[i][0])
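
The same loop can collect inputs and labels together. The sketch below assumes only that the batch object supports len() and integer indexing, as the code above does. `collect_all` and `FakeBatches` are hypothetical names; `FakeBatches` is a stand-in so the example runs without Keras or image files.

```python
def collect_all(batches):
    """Gather inputs and labels from an object supporting len() and batches[i]."""
    xs, ys = [], []
    for i in range(len(batches)):
        x, y = batches[i]
        xs.extend(x)
        ys.extend(y)
    return xs, ys


class FakeBatches:
    """Hypothetical stand-in for a generator: returns (inputs, labels) per batch."""

    def __init__(self, inputs, labels, batch_size):
        self.inputs = inputs
        self.labels = labels
        self.batch_size = batch_size

    def __len__(self):
        # Ceiling division without floats
        return -(-len(self.inputs) // self.batch_size)

    def __getitem__(self, i):
        start = i * self.batch_size
        stop = start + self.batch_size
        return self.inputs[start:stop], self.labels[start:stop]


demo = FakeBatches(["a.png", "b.png", "c.png"], [0, 1, 1], batch_size=2)
images, labels = collect_all(demo)
print(labels)  # [0, 1, 1]
```

With a real generator, keep in mind that flow_from_directory shuffles by default, so labels collected batch by batch only line up with test_batches.filenames if shuffle=False was passed.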

jingang li answered Oct 17 '22 23:10


You can get a dictionary from the DirectoryIterator that maps each class name to its index in the one-hot encoding. Its keys are the class names (one entry per class, not one per image):

test_batches.class_indices.keys()
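
Because this dictionary has one entry per class, it is most useful for turning per-image integer labels back into class names. Here is a minimal sketch of that lookup; the dictionary and the label list are hypothetical sample values shaped like class_indices and a list of integer labels, not output from a real run.

```python
# Hypothetical mapping, shaped like test_batches.class_indices
class_indices = {"cats": 0, "dogs": 1}
# Hypothetical per-image integer labels
labels = [0, 1, 1, 0]

# Invert the mapping: class index -> class name
index_to_name = {index: name for name, index in class_indices.items()}
label_names = [index_to_name[i] for i in labels]
print(label_names)  # ['cats', 'dogs', 'dogs', 'cats']
```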

p13rr0m answered Oct 17 '22 21:10