How to use featurewise_center=True together with flow_from_directory in ImageDataGenerator?

I set featurewise_center=True and then use flow_from_directory to set up my training and validation data in Keras. However, I get this warning:

UserWarning: This ImageDataGenerator specifies `featurewise_center`, but it hasn't been fit on any training data. Fit it first by calling `.fit(numpy_data)`.

Is there a way to use flow_from_directory and still fit the generator as required?

Mlui Avatar asked Apr 13 '19 09:04




1 Answer

featurewise_center shifts the images so that the dataset has zero mean. This is done using the formula

X = X - mean(X)

But for the ImageDataGenerator to apply this transformation, it needs to know the mean of the dataset, and the fit method on ImageDataGenerator calculates exactly these statistics.

As the Keras docs explain:

Fits the data generator to some sample data. This computes the internal data stats related to the data-dependent transformations, based on an array of sample data.
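
To make the formula X = X - mean(X) concrete, here is a minimal NumPy sketch of what feature-wise centering amounts to for channels-last images: one mean per color channel, computed over the whole dataset and subtracted from every image. The `images` array is made-up data, and this only illustrates the idea; it is not Keras's internal code.

```python
import numpy as np

# Hypothetical dataset: 4 RGB images of 8x8 pixels, channels last.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 8, 8, 3)).astype("float32")

# One mean per channel, taken over all images, rows and columns.
channel_mean = images.mean(axis=(0, 1, 2))  # shape (3,)

# X = X - mean(X): broadcasting subtracts the per-channel mean from every pixel.
centered = images - channel_mean

# After centering, each channel averages to approximately zero.
centered_mean = centered.mean(axis=(0, 1, 2))
```

The statistic depends on the whole dataset, not on a single batch, which is why the generator has to see the data through fit before it can center anything.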

If the dataset fits entirely in memory, we can load all the images into a NumPy array and run fit on it.

Sample code (RGB images of 256x256):

from keras.layers import Dense, Flatten, Conv2D
from keras.models import Sequential
from keras.preprocessing.image import ImageDataGenerator
import numpy as np
from pathlib import Path
from PIL import Image

height = width = 256

def read_pil_image(img_path, height, width):
    # Load one image as an RGB array of shape (height, width, 3)
    with open(img_path, 'rb') as f:
        return np.array(Image.open(f).convert('RGB').resize((width, height)))

def load_all_images(dataset_path, height, width, img_ext='png'):
    # Recursively load every matching image into one array of shape (N, height, width, 3)
    return np.array([read_pil_image(str(p), height, width)
                     for p in Path(dataset_path).rglob("*." + img_ext)])

# Compute the dataset mean before creating the directory iterator
train_datagen = ImageDataGenerator(featurewise_center=True)
train_datagen.fit(load_all_images('./images/', height, width))

train_generator = train_datagen.flow_from_directory(
        './images/',
        target_size=(height, width),
        batch_size=32,
        class_mode='binary',
        color_mode='rgb')

model = Sequential()
model.add(Conv2D(1, (3, 3), input_shape=(height, width, 3)))
model.add(Flatten())
# Sigmoid output so that binary_crossentropy receives probabilities
model.add(Dense(1, activation='sigmoid'))
model.compile('adam', 'binary_crossentropy')

# Newer Keras versions accept the generator in model.fit directly
model.fit_generator(train_generator)

But what if the data cannot be fully loaded into memory? One approach is to fit the generator on a random sample of images from the dataset.
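
A rough sketch of the sampling idea, assuming the image paths can be listed up front. `sample_for_fit`, `load_fn` and `fake_loader` are hypothetical names made up for this example; the stand-in loader is only there so the snippet runs without any files on disk.

```python
import random
import numpy as np

def sample_for_fit(paths, load_fn, sample_size, seed=0):
    """Load a random subset of images into one array suitable for datagen.fit().

    paths: list of image paths.
    load_fn: callable returning a fixed-size (H, W, C) array for a path.
    Both are hypothetical helpers for this sketch.
    """
    rng = random.Random(seed)
    k = min(sample_size, len(paths))
    chosen = rng.sample(list(paths), k)  # sample without replacement
    return np.stack([load_fn(p) for p in chosen])

# Stand-in loader: returns a constant 4x4 RGB image instead of reading a file.
def fake_loader(path):
    return np.full((4, 4, 3), float(len(path)), dtype="float32")

paths = ["img_%d.png" % i for i in range(100)]
sample = sample_for_fit(paths, fake_loader, sample_size=10)
```

With real data, `load_fn` could be something like `lambda p: read_pil_image(p, height, width)` and the result would be passed to `train_datagen.fit(sample)`. The mean is then only an estimate of the true dataset mean, and it gets better as the sample grows.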

Normally we compute the mean on the training data only and reuse that same mean to normalize the validation/test data. It is a bit tricky to do the same via the data generator.
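
The idea itself is simple, as this NumPy sketch with made-up arrays shows: the statistic comes from the training split and is applied unchanged to the validation split. As far as I know, fit stores the statistic on the generator's `mean` attribute, so something like `val_datagen.mean = train_datagen.mean` may be one way to get the same effect with two generators, but treat that as an assumption to verify against your Keras version.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up train/validation splits, channels last.
train = rng.uniform(0, 255, size=(20, 8, 8, 3))
val = rng.uniform(0, 255, size=(5, 8, 8, 3))

# Per-channel mean computed on the training data only.
train_mean = train.mean(axis=(0, 1, 2))

# Both splits are centered with the training mean, never with val.mean().
train_centered = train - train_mean
val_centered = val - train_mean
```

The validation data will generally not end up with exactly zero mean, and that is expected: it is shifted by the same amount as the training data.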

mujjiga Avatar answered Oct 19 '22 14:10
