
What does the class_mode parameter in Keras image_gen.flow_from_directory() signify?

train_image_gen = image_gen.flow_from_directory('/Users/harshpanwar/Desktop/Folder/train',
                                               target_size=image_shape[:2],
                                               batch_size=batch_size,
                                               class_mode='binary')

In the above code snippet, what does class_mode='binary' signify? I think it relates to the number of image categories. I am using this code to train an image classifier in Keras that distinguishes between two categories, such as dog and cat. If class_mode='binary' signifies two categories, how do we set it up for three or more?

Harsh Panwar asked Dec 21 '19 19:12




2 Answers

class_mode: One of "categorical", "binary", "sparse", "input", or None. Default: "categorical". Determines the type of label arrays that are returned:

  • "categorical" will be 2D one-hot encoded labels,
  • "binary" will be 1D binary labels,
  • "sparse" will be 1D integer labels,
  • "input" will be images identical to input images (mainly used to work with autoencoders).
  • If None, no labels are returned (the generator will only yield batches of image data, which is useful to use with model.predict_generator()).

Please note that in case of class_mode None, the data still needs to reside in a subdirectory of directory for it to work correctly.
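To make the directory convention concrete, here is a small sketch that builds a throwaway folder tree and mimics how class indices are inferred from subdirectory names, in alphanumeric order as the Keras docs describe. This is not Keras code: the helpers infer_class_indices and make_tree and the folder names (train, predict, unlabeled) are made up for illustration.

```python
import tempfile
from pathlib import Path


def infer_class_indices(directory):
    """Map each immediate subdirectory name to an integer index, in sorted order.

    Hypothetical helper that emulates the documented behaviour; not a Keras API.
    """
    names = sorted(p.name for p in Path(directory).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(names)}


def make_tree(root, class_names):
    """Create one empty subdirectory per class under root (illustration only)."""
    for name in class_names:
        (Path(root) / name).mkdir(parents=True, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    # Training layout: one subdirectory per class.
    train_dir = Path(tmp) / "train"
    make_tree(train_dir, ["dog", "cat", "donkey"])
    train_indices = infer_class_indices(train_dir)

    # Layout for class_mode=None: the images still sit one level down,
    # e.g. predict/unlabeled/, even though no labels are produced.
    predict_dir = Path(tmp) / "predict"
    make_tree(predict_dir, ["unlabeled"])
    predict_indices = infer_class_indices(predict_dir)

print(train_indices)
print(predict_indices)
```

With a real generator, the equivalent mapping should be available as the class_indices attribute of the object returned by flow_from_directory().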

Soheil Hosseini answered Sep 19 '22 11:09


Say you have N classes in your dataset. If you have 4 labels, dog (index 0), cat (1), donkey (2) and human (3), N would be 4.

Class modes:

  • "categorical": 2D output for a batch (each label is a list of N numbers), e.g. [0, 0, 1, 0], which is a one-hot encoding (only one number is 1, or "hot") representing the donkey. This is for mutually exclusive labels: a dog cannot be a cat, and a human is not a dog.
  • "binary": 1D output (one number per image), which is one of 0, 1, 2, ..., N-1. It is called this because the labels are binary when there are only two classes (IMHO this is a bad reason), source. I suggest using "binary" only for two-class classification, because it documents your intention in code.
  • "sparse": After digging in the code, this is the same as "binary". The logic is done with elif self.class_mode in {'binary', 'sparse'}:, and class_mode is not used after that. I suggest using "sparse" for multi-class classification (three or more classes) though, again because it documents your intention in code.
  • "input": The label is literally the image again, so the label for an image of the dog is the same dog picture array. This is mainly useful for autoencoders, which are trained to reconstruct their input.
  • None: No labels, so it is not useful for training, only for inference/prediction.
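Here is a rough, pure-Python emulation of what the labels look like for the dog/cat/donkey/human example above. It does not call Keras; emulate_labels and one_hot are hypothetical helpers written only to show the shapes, and the "input" mode is left out because it needs actual image arrays.

```python
def one_hot(index, num_classes):
    """Return a list of length num_classes with a 1 at position index."""
    return [1 if i == index else 0 for i in range(num_classes)]


def emulate_labels(class_indices, num_classes, class_mode):
    """Emulate the labels a batch would carry for a given class_mode.

    class_indices holds one integer class index per image in the batch.
    Illustration only; this is not how Keras is implemented internally.
    """
    if class_mode == "categorical":
        # One row per image, one column per class: a 2D batch of labels.
        return [one_hot(i, num_classes) for i in class_indices]
    if class_mode in {"binary", "sparse"}:
        # One number per image: a 1D batch of labels.
        return list(class_indices)
    if class_mode is None:
        # No labels at all; the generator would yield images only.
        return None
    raise ValueError(f"class_mode not covered by this sketch: {class_mode!r}")


# dog = 0, cat = 1, donkey = 2, human = 3, so N = 4.
batch = [2, 0, 3]  # donkey, dog, human

categorical_labels = emulate_labels(batch, 4, "categorical")
sparse_labels = emulate_labels(batch, 4, "sparse")
binary_labels = emulate_labels(batch, 4, "binary")
no_labels = emulate_labels(batch, 4, None)

print(categorical_labels)
print(sparse_labels)
```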

The TensorFlow documentation is here but I think it should go into more depth for class_mode:

One of "categorical", "binary", "sparse", "input", or None. Default: "categorical". Determines the type of label arrays that are returned:

  • "categorical" will be 2D one-hot encoded labels,
  • "binary" will be 1D binary labels,
  • "sparse" will be 1D integer labels,
  • "input" will be images identical to input images (mainly used to work with autoencoders).
  • If None, no labels are returned (the generator will only yield batches of image data, which is useful to use with model.predict()).

Please note that in case of class_mode None, the data still needs to reside in a subdirectory of directory for it to work correctly.


Is sparse the same as binary?

In my search of the code, "sparse" is only checked twice (lines 2 and 4 of the search results). I believe "sparse" is intended for multi-class classification and "binary" for two-class classification (hot dog vs. not hot dog), but currently there is no difference, since the behaviour is the same.

[screenshot of the search results]
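Where the choice does matter in practice is the output layer and loss you pair with each mode. The sketch below shows the usual pairings, which also covers the original "three or more categories" question. The helper names head_for and build_classifier, the tiny network, and the rule that "binary" requires exactly two classes are my own assumptions for illustration, not something the Keras generator enforces.

```python
def head_for(class_mode, num_classes):
    """Return (units, activation, loss) commonly paired with a class_mode.

    Hypothetical helper; the pairings are conventional, not mandated by Keras.
    """
    if class_mode == "binary":
        if num_classes != 2:
            raise ValueError("this sketch only pairs 'binary' with exactly 2 classes")
        # A single sigmoid unit predicts the probability of class 1.
        return 1, "sigmoid", "binary_crossentropy"
    if class_mode == "categorical":
        # One-hot labels go with categorical cross-entropy.
        return num_classes, "softmax", "categorical_crossentropy"
    if class_mode == "sparse":
        # Integer labels go with the sparse variant of the same loss.
        return num_classes, "softmax", "sparse_categorical_crossentropy"
    raise ValueError(f"class_mode not covered by this sketch: {class_mode!r}")


def build_classifier(input_shape, class_mode, num_classes):
    """Build a deliberately tiny CNN whose head matches the class_mode."""
    # Imported lazily so head_for() can be used without TensorFlow installed.
    from tensorflow import keras

    units, activation, loss = head_for(class_mode, num_classes)
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        keras.layers.Conv2D(16, 3, activation="relu"),
        keras.layers.GlobalAveragePooling2D(),
        keras.layers.Dense(units, activation=activation),
    ])
    model.compile(optimizer="adam", loss=loss, metrics=["accuracy"])
    return model


two_class_head = head_for("binary", 2)
three_class_head = head_for("sparse", 3)
one_hot_head = head_for("categorical", 3)
print(two_class_head, three_class_head, one_hot_head)
```

So for dog vs. cat you could keep class_mode='binary', and for three or more classes you would switch to 'categorical' (or 'sparse') and change the last layer and loss to match, e.g. build_classifier((150, 150, 3), "categorical", 3), where the input shape is just an example value.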

Ben Butterworth answered Sep 21 '22 11:09