I am training an image classification CNN using Keras.
Using the ImageDataGenerator class, I apply some random transformations to the training images (e.g. rotation, shearing, zooming).
My understanding is that these transformations are applied randomly to each image before it is passed to the model.
But some things are not clear to me:
1) How can I make sure that specific rotations of an image (e.g. 90°, 180°, 270°) are ALL included during training?
2) The steps_per_epoch parameter of model.fit_generator should be set to the number of unique samples in the dataset divided by the batch size defined in the flow_from_directory method. Does this still apply when using the above-mentioned augmentation methods, since they increase the number of training images?
Thanks, Mario
In Keras, steps_per_epoch is an argument to the model's fit function. It is the total number of training samples divided by the chosen batch size, so as the batch size increases, the number of steps per epoch decreases, and vice versa.
An epoch consists of one full cycle through the training data, which usually means many steps. For example, if you have 2,000 images and use a batch size of 10, an epoch consists of 2,000 images / (10 images per step) = 200 steps.
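The same arithmetic as a quick code sketch (the sample count and batch size here are just illustrative numbers, not taken from the question):

import math

num_samples = 2000  # total number of unique training images (illustrative)
batch_size = 10     # batch size passed to the generator (illustrative)

# One epoch is one pass over the data, so the step count is the sample count
# divided by the batch size (rounded up when it does not divide evenly).
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # 200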
Keras's ImageDataGenerator takes the original data as input, applies random transformations to it, and outputs only the newly transformed data.
Some time ago I asked myself the same questions, and I think a possible explanation is the following.
Consider this example:
from keras.preprocessing.image import ImageDataGenerator

aug = ImageDataGenerator(rotation_range=90, width_shift_range=0.1,
                         height_shift_range=0.1, shear_range=0.2,
                         zoom_range=0.2, horizontal_flip=True,
                         fill_mode="nearest")
For question 1): I specify rotation_range=90, which means that as you flow (retrieve) the data, the generator rotates each image by a random angle in the range of -90° to 90°. You cannot specify an exact angle, because that is exactly what ImageDataGenerator does: it generates the rotation randomly. This is also important for your second question.
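If you really need the exact 90°, 180°, and 270° orientations rather than arbitrary random angles, one possible workaround (a sketch, not part of the answer above) is to do the rotation yourself in a preprocessing_function, which ImageDataGenerator applies to each image, so over many epochs every orientation gets seen. This assumes square images, since np.rot90 would otherwise change the shape:

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

def random_right_angle_rotation(image):
    # Rotate by 0, 90, 180 or 270 degrees, chosen uniformly at random.
    # Assumes square images; np.rot90 changes the shape otherwise.
    k = np.random.randint(4)
    return np.rot90(image, k)

aug = ImageDataGenerator(preprocessing_function=random_right_angle_rotation,
                         width_shift_range=0.1, height_shift_range=0.1,
                         shear_range=0.2, zoom_range=0.2,
                         horizontal_flip=True, fill_mode="nearest")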
For question 2): Yes, it still applies when using the data augmentation methods above. I was also confused at first. The reason is that, since the images are generated randomly, in each epoch the network sees images that all differ from those of the previous epoch. That is why the data is "augmented": the augmentation happens not within a single epoch but over the entire training process. However, I have seen other people set steps_per_epoch to twice the original value.
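Putting it together, a minimal sketch of how the generator and steps_per_epoch are usually wired up; the directory path, image size, batch size, and the tiny model below are placeholders I am assuming, not details from the question:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator

aug = ImageDataGenerator(rotation_range=90, width_shift_range=0.1,
                         height_shift_range=0.1, shear_range=0.2,
                         zoom_range=0.2, horizontal_flip=True,
                         fill_mode="nearest")

batch_size = 32  # illustrative
train_gen = aug.flow_from_directory("data/train",            # placeholder path
                                    target_size=(224, 224),  # placeholder size
                                    batch_size=batch_size,
                                    class_mode="categorical")

# A deliberately small placeholder model, just so the example runs end to end.
model = Sequential([
    Conv2D(16, 3, activation="relu", input_shape=(224, 224, 3)),
    MaxPooling2D(),
    Flatten(),
    Dense(train_gen.num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# steps_per_epoch stays at (unique samples / batch size); augmentation does not
# change it, because the "extra" images show up over successive epochs instead.
model.fit_generator(train_gen,
                    steps_per_epoch=train_gen.samples // batch_size,
                    epochs=50)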
Hope this helps