We can generate an image dataset using ImageDataGenerator with the flow_from_directory method. To get the list of classes, we can use object.classes. But how do we get the list of values? I've searched and still haven't found anything.
Thanks :)
ImageDataGenerator.flow_from_directory gives you a Python generator: each time, it yields a batch of data with the same shape as your model inputs, e.g. (batch_size, height, width, channels). The benefit of a generator is that when your data set is too big to fit into your limited memory, you can still produce one batch of data at a time. ImageDataGenerator works with model.fit_generator() and model.predict_generator().
If you want to get the numeric data, you can call next() on the generator:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1. / 255)
data_generator = datagen.flow_from_directory(
    data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')
data_list = []
# len(data_generator) is the number of batches in one pass over the data
for _ in range(len(data_generator)):
    data = next(data_generator)  # an (images, labels) tuple
    data_list.append(data[0])
# now, data_array is the numeric data of all images;
# concatenate joins the batches even if the last one is smaller
data_array = np.concatenate(data_list)
Alternatively, you can use PIL and numpy to process the images yourself:
from PIL import Image
import numpy as np

def image_to_array(file_path):
    img = Image.open(file_path)
    # PIL's resize takes (width, height)
    img = img.resize((img_width, img_height))
    data = np.asarray(img, dtype='float32')
    # for an RGB image, data has shape (height, width, channels)
    return data
Then you can loop over all your images with this function to get the numeric data.
Note: I recommend using the generator instead of loading all the data directly; otherwise you might run out of memory.
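The memory advice above can be sketched without Keras, as a plain Python generator that loads files one batch at a time. The names iter_batches, load_fn and fake_load are hypothetical and exist only for this illustration; with real data, load_fn would be something like the image_to_array function above.

```python
def iter_batches(file_paths, load_fn, batch_size):
    """Yield lists of loaded items, at most batch_size items per list."""
    batch = []
    for path in file_paths:
        batch.append(load_fn(path))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final batch, possibly smaller than batch_size
        yield batch


# Stand-in loader so the sketch runs without image files (an assumption
# for the demo); swap in image_to_array for real data.
def fake_load(path):
    return path.upper()


paths = ["a.png", "b.png", "c.png", "d.png", "e.png"]
# list() pulls every batch at once, which is fine for a tiny demo;
# with real images you would iterate with a for loop instead.
batches = list(iter_batches(paths, fake_load, batch_size=2))
batch_sizes = [len(b) for b in batches]
```

Only one batch is held in memory while you iterate, which is the same property that makes the Keras generator preferable for large data sets.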
'But, how to call list of values' - If I understood correctly, you want to know which files are in your data set. Whether or not that's the case, there are various ways to get values from your generator:
object.filenames returns the list of all files in your target folder. I just use len(object.filenames) to get the total number of files in my test folder, then pass that number back into my generator and run it again.
Another way to get the number of items in your test folder is generator.n.
I only did this because my test data set was really small (60 images) and I was using extracted features, not the image arrays, to train my model and predict. If you are building a normal model, using the generator to yield batches is a much better approach.
import math
import numpy as np
from keras.preprocessing.image import ImageDataGenerator

def generate_test_data_from_directory(folder_path, image_target_size=224, batch_size=5, channels=3, class_mode='sparse'):
    '''Fetch all of our test data from a directory.'''
    # note: the channels argument is not used below
    test_datagen = ImageDataGenerator(rescale=1./255)
    test_generator = test_datagen.flow_from_directory(
        folder_path,
        target_size=(image_target_size, image_target_size),
        batch_size=batch_size,
        class_mode=class_mode)
    total_images = test_generator.n
    # iterations to cover all data; round up so a final partial batch is not dropped
    steps = math.ceil(total_images / batch_size)
    x, y = [], []
    for i in range(steps):
        a, b = next(test_generator)
        x.extend(a)
        y.extend(b)
    return np.array(x), np.array(y)
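One detail worth checking in a loop like this is the step count. Floor division covers every image only when the total is an exact multiple of the batch size; otherwise the remainder is skipped. The sketch below is plain arithmetic, with steps_per_pass as a hypothetical helper name, and it assumes the generator yields a final smaller batch rather than dropping it.

```python
import math


def steps_per_pass(total_images, batch_size):
    """Number of next() calls needed to see every image once."""
    return math.ceil(total_images / batch_size)


# With 60 images and batches of 5, both approaches agree.
exact_fit = steps_per_pass(60, 5)

# With 62 images, floor division stops after 12 batches (60 images),
# while rounding up adds a 13th batch holding the last 2 images.
floor_steps = 62 // 5
ceil_steps = steps_per_pass(62, 5)
images_missed_by_floor = 62 - floor_steps * 5
```

For the 60-image test set mentioned above with a batch size of 5, the two calculations give the same result, so the difference only shows up with other sizes.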