
Understanding `width_shift_range` and `height_shift_range` arguments in Keras's ImageDataGenerator class

The Keras documentation for the ImageDataGenerator class says:

width_shift_range: Float, 1-D array-like or int
- float: fraction of total width, if < 1, or pixels if >= 1.
- 1-D array-like: random elements from the array.
- int: integer number of pixels from interval (-width_shift_range, +width_shift_range)
- With width_shift_range=2 possible values are integers [-1, 0, +1], same as with width_shift_range=[-1, 0, +1], while with width_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0).

height_shift_range: Float, 1-D array-like or int
- float: fraction of total height, if < 1, or pixels if >= 1.
- 1-D array-like: random elements from the array.
- int: integer number of pixels from interval (-height_shift_range, +height_shift_range)
- With height_shift_range=2 possible values are integers [-1, 0, +1], same as with height_shift_range=[-1, 0, +1], while with height_shift_range=1.0 possible values are floats in the interval [-1.0, +1.0).

I’m new to Keras and machine learning, and I have just started learning them.

I am struggling to understand the documentation and use of two arguments of the Keras ImageDataGenerator class, width_shift_range and height_shift_range. I have searched a lot, but couldn't find any good documentation other than the official one. What exactly do these two arguments do? When should I use them?

This question may seem out of place here, but since I could not find any discussion of it elsewhere on the internet, I think it would be nice to have the discussion here.

If anyone can help me understand these, I would be grateful. Thank you very much.

Arafat Hasan asked Jun 20 '20 10:06





1 Answer

These two arguments are used by the ImageDataGenerator class, which preprocesses images before feeding them into the network. If you want to make your model more robust, a small amount of data is not enough. That is where data augmentation comes in handy: these arguments are used to generate randomly shifted variants of your images.

width_shift_range: It shifts the image to the left or right (a horizontal shift). If the value is a float less than 1, it is taken as a fraction of the total width. Suppose the image width is 100px. If width_shift_range = 0.5, the range is -50% to +50%, that is -50px to +50px, and the image is shifted by a random amount within this range. The sign of the randomly selected value decides the direction (left or right); since the range is symmetric, both directions are equally likely. We can also specify the range in pixels: width_shift_range = 50 has nearly the same effect, except that the shift is a whole number of pixels. The key point is that a value >= 1 is counted in pixels and a float < 1 is counted as a fraction of the total width. So, per the documentation quoted in the question, width_shift_range = 1.0 gives a shift in [-1.0, +1.0) pixels, not ±100% of the width. The images below are for width_shift_range = 1.0.

[Image: For value 1]
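The rules above can be sketched in plain Python. This is only an illustration of the documentation text quoted in the question, not Keras's actual implementation; the function name `sample_shift_pixels` and its parameters are made up for this example.

```python
import random


def sample_shift_pixels(shift_range, image_size, rng=random):
    """Return one random shift, in pixels, for an image axis of `image_size` pixels.

    Hypothetical helper that follows the documented rules:
    float < 1 is a fraction, float >= 1 is pixels, int is whole pixels,
    and a 1-D array-like is a list of candidate shifts.
    """
    if isinstance(shift_range, float):
        # Uniform draw between -shift_range and +shift_range.
        shift = rng.uniform(-shift_range, shift_range)
        if shift_range < 1:
            # A float below 1 is a fraction of the axis length.
            shift *= image_size
        return shift
    if isinstance(shift_range, int):
        if shift_range <= 0:
            return 0
        # Whole pixels from the open interval (-shift_range, +shift_range).
        return rng.randint(-(shift_range - 1), shift_range - 1)
    # 1-D array-like: pick one of the listed values.
    return rng.choice(list(shift_range))


rng = random.Random(0)  # seeded only to make the demo repeatable
print(sample_shift_pixels(0.2, 100, rng))         # somewhere between -20 and +20 px
print(sample_shift_pixels(1.0, 100, rng))         # somewhere between -1 and +1 px
print(sample_shift_pixels(2, 100, rng))           # one of -1, 0, +1
print(sample_shift_pixels([-1, 0, 1], 100, rng))  # one of -1, 0, +1
```

Note how 0.2 scales with the image width while 1.0 and 2 do not: that is the fraction-versus-pixels distinction.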

height_shift_range: It works the same as width_shift_range but shifts the image vertically (up or down). The images below are for height_shift_range=0.2, fill_mode="constant".

[Image not preserved]
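To make the vertical case concrete, here is a tiny sketch that shifts a 5-row "image" (a list of rows) by a whole number of rows and fills the gap with a constant. The helper `shift_rows` is hypothetical, the sign convention (positive means down) is an assumption of this sketch, and real Keras shifts can be fractional.

```python
def shift_rows(image, shift, cval=0):
    """Shift a 2-D list of pixel rows down by `shift` rows (up if negative).

    Rows that move out of the frame are dropped; rows that become empty
    are filled with `cval`, like fill_mode="constant".
    """
    height = len(image)
    width = len(image[0])
    shifted = []
    for r in range(height):
        src = r - shift  # original row that ends up at row r
        if 0 <= src < height:
            shifted.append(list(image[src]))
        else:
            shifted.append([cval] * width)
    return shifted


image = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]]

# With height_shift_range=0.2 and 5 rows, the shift is at most 0.2 * 5 = 1 row.
print(shift_rows(image, 1))   # [[0, 0], [1, 1], [2, 2], [3, 3], [4, 4]]
print(shift_rows(image, -1))  # [[2, 2], [3, 3], [4, 4], [5, 5], [0, 0]]
```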

fill_mode: It sets the rule for filling the pixels that are left empty inside the frame after the image is shifted.

fill_mode: One of {"constant", "nearest", "reflect", "wrap"}. Points outside the boundaries of the input are filled according to the given mode:
- "constant": kkkkkkkk|abcd|kkkkkkkk (cval=k)
- "nearest": aaaaaaaa|abcd|dddddddd
- "reflect": abcddcba|abcd|dcbaabcd
- "wrap": abcdabcd|abcd|abcdabcd
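The four modes can be reproduced on the same `abcd` row with a short pure-Python sketch. The function `shift_1d` is a made-up helper that follows the diagrams above, not the routine Keras uses internally, and it assumes a positive shift moves the content to the right.

```python
def shift_1d(row, shift, fill_mode="constant", cval="k"):
    """Shift the string `row` right by `shift` positions (left if negative).

    Positions whose source falls outside the row are filled
    according to `fill_mode`, following the diagrams above.
    """
    n = len(row)
    out = []
    for i in range(n):
        src = i - shift  # index in the original row that lands at position i
        if 0 <= src < n:
            out.append(row[src])
        elif fill_mode == "constant":
            out.append(cval)
        elif fill_mode == "nearest":
            out.append(row[0] if src < 0 else row[-1])
        elif fill_mode == "wrap":
            out.append(row[src % n])
        elif fill_mode == "reflect":
            m = src % (2 * n)  # position inside one mirrored period
            out.append(row[m] if m < n else row[2 * n - 1 - m])
        else:
            raise ValueError(f"unknown fill_mode: {fill_mode!r}")
    return "".join(out)


for mode in ("constant", "nearest", "reflect", "wrap"):
    print(mode, shift_1d("abcd", 2, mode))
# constant kkab
# nearest aaab
# reflect baab
# wrap cdab
```

Each output is the 4-character window taken two positions to the left of `|abcd|` in the corresponding diagram.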

For more you can check this blog

Sayed Sohan answered Sep 29 '22 14:09
