
How to get list of values in ImageDataGenerator.flow_from_directory Keras?

We can generate an image dataset using ImageDataGenerator with the flow_from_directory method. To get the list of classes, we can use object.classes. But how do I get the list of values? I've searched and still haven't found anything.

Thanks :)

Ardian asked Jun 30 '17 08:06



2 Answers

flow_from_directory returns a Python generator: each call yields a batch of data with the same shape as your model inputs, such as (batch_size, height, width, channels). The benefit of a generator is that when your dataset is too big to fit into limited memory, you can still produce one batch at a time. The generator works with model.fit_generator() and model.predict_generator().
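To make the memory argument concrete, here is a minimal pure-Python sketch of batch-wise generation. It is only an illustration, not Keras code: `batch_generator` and `fake_images` are made-up names, and the strings stand in for image arrays.

```python
def batch_generator(items, batch_size):
    """Yield successive batches from `items`, one batch at a time."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Hypothetical stand-in for a folder of 10 images
fake_images = ["img_%d" % i for i in range(10)]
gen = batch_generator(fake_images, batch_size=4)

first_batch = next(gen)   # only this batch is produced at this point
remaining = list(gen)     # the rest, consumed batch by batch
```

Unlike this sketch, which stops after one pass, the Keras iterator keeps cycling through the data, so a loop that collects everything needs an explicit stopping point.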

If you want to get the numeric data, you can use the next() function of the generator:

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1. / 255)

data_generator = datagen.flow_from_directory(
    data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')

# len(data_generator) is the number of batches in one pass over the data
data_list = []
for _ in range(len(data_generator)):
    images, labels = next(data_generator)
    data_list.append(images)

# now, data_array is the numeric data of all images,
# with shape (num_images, img_height, img_width, channels)
data_array = np.concatenate(data_list)

Alternatively, you can use PIL and numpy to process the images yourself:

from PIL import Image
import numpy as np

def image_to_array(file_path):
    img = Image.open(file_path)
    # PIL's resize takes (width, height)
    img = img.resize((img_width, img_height))
    # data is a tensor with shape (height, width, channels) for a single RGB image
    data = np.asarray(img, dtype='float32')
    return data

Then you can loop over all your images with this function to get the numeric data.
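One way to build the list of files to loop over is sketched below using only the standard library. The helper name `list_image_files` and the set of extensions are assumptions made for illustration, and the `cat/` and `dog/` folders are a throwaway demo tree rather than real data.

```python
import os
import tempfile

IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.bmp')  # assumed set of extensions

def list_image_files(root_dir):
    """Return sorted paths of image files found anywhere under root_dir."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if name.lower().endswith(IMAGE_EXTENSIONS):
                paths.append(os.path.join(dirpath, name))
    return sorted(paths)

# Demo on a temporary directory tree with made-up class folders
with tempfile.TemporaryDirectory() as root:
    for rel in ('cat/a.jpg', 'dog/b.png', 'dog/notes.txt'):
        path = os.path.join(root, *rel.split('/'))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, 'w').close()
    found = [os.path.relpath(p, root).replace(os.sep, '/')
             for p in list_image_files(root)]
```

With such a list in hand, something like `[image_to_array(p) for p in list_image_files(data_dir)]` would collect the arrays, at the cost of holding every image in memory at once.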

Note: I recommend using the generator instead of loading all the data directly; otherwise you might run out of memory.

Craig.Li answered Nov 14 '22 23:11



'But, how to call list of values': if I understood correctly, you want to know which files are in your dataset. Whether or not that's right, there are several ways to get values from your generator:

  1. Use object.filenames.

object.filenames returns the list of all files in your target folder. I just use len(object.filenames) to get the total number of files in my test folder, then pass that number back into my generator and run it again.

  2. generator.n

Another way to get the number of all items in your test folder is generator.n.

  3. x, y = test_generator.next() to load my array and classes (if inferred). Or a = test_generator.next(), where your array and classes will be returned as a tuple.

I only used this because my test dataset was really small (60 images) and I was using extracted features to train and predict with my model (that is, a feature array and not the image array). If you are building a normal model, using the generator to yield batches is a much better way.

  4. Create a function using the generator:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

def generate_test_data_from_directory(folder_path, image_target_size=224, batch_size=5, channels=3, class_mode='sparse'):
    '''Fetch all of our test data from a directory.'''
    test_datagen = ImageDataGenerator(rescale=1./255)
    test_generator = test_datagen.flow_from_directory(
        folder_path,
        target_size=(image_target_size, image_target_size),
        batch_size=batch_size,
        class_mode=class_mode)

    total_images = test_generator.n
    # Iterations needed to cover all data; round up so a final partial batch is not dropped
    steps = (total_images + batch_size - 1) // batch_size

    x, y = [], []
    for i in range(steps):
        a, b = test_generator.next()
        x.extend(a)
        y.extend(b)

    return np.array(x), np.array(y)
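
The step count matters when the number of images is not a multiple of the batch size. A quick check in plain Python, with no Keras required; the helper name `steps_to_cover` is made up for illustration:

```python
def steps_to_cover(total_images, batch_size):
    """Number of batches needed so that every image is seen once (rounds up)."""
    return (total_images + batch_size - 1) // batch_size

exact = steps_to_cover(60, 5)     # 60 images in batches of 5
partial = steps_to_cover(62, 5)   # 62 images leave a final batch of 2
floor_steps = 62 // 5             # floor division would skip that final batch
```

With 60 images and a batch size of 5, both approaches give 12 steps; with 62 images, rounding up gives 13 while floor division stops at 12 and leaves two images unread.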
user13827006 answered Nov 14 '22 22:11
