Using Keras ImageDataGenerator in a regression model

I want to use the flow_from_directory method of the ImageDataGenerator to generate training data for a regression model, where the target value can be any float between -1 and 1.

flow_from_directory has a "class_mode" parameter with the description:

class_mode: one of "categorical", "binary", "sparse" or None. Default: "categorical". Determines the type of label arrays that are returned: "categorical" will be 2D one-hot encoded labels, "binary" will be 1D binary labels, "sparse" will be 1D integer labels.

Which of these values should I take? None of them seems to really fit...

asked Jan 19 '17 18:01 by user1934212





2 Answers

With Keras 2.2.4 you can use flow_from_dataframe, which does what you want: it lets you flow images from a directory for regression problems. Store all your images in one folder, load a DataFrame that has the image IDs in one column and the regression scores (labels) in another, and set class_mode='other' in flow_from_dataframe.

Here is an example where the images are in "image_dir" and the DataFrame with the image IDs and regression scores is loaded with pandas from "train_file":

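    import pandas as pd
    from keras.preprocessing.image import ImageDataGenerator

    # train_file, image_dir, img_width, img_height and bs are assumed to be defined elsewhere.
    # Each row of train_file holds an image filename and its regression score, separated by a space.
    train_label_df = pd.read_csv(train_file, delimiter=' ', header=None, names=['id', 'score'])

    # Random augmentations applied to each training image as it is loaded.
    train_datagen = ImageDataGenerator(rescale=1./255, horizontal_flip=True,
                                       fill_mode="nearest", zoom_range=0.2,
                                       width_shift_range=0.2, height_shift_range=0.2,
                                       rotation_range=30)

    # class_mode="other" returns the y_col values unchanged as targets
    # (newer releases call this mode "raw", as far as I know).
    # Keras documents target_size as (height, width).
    train_generator = train_datagen.flow_from_dataframe(dataframe=train_label_df, directory=image_dir,
                                                        x_col="id", y_col="score", has_ext=True,
                                                        class_mode="other", target_size=(img_height, img_width),
                                                        batch_size=bs)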
answered Sep 21 '22 14:09 by Gian



I think that organizing your data differently, using a DataFrame (without necessarily moving your images to new locations), will allow you to run a regression model. In short, create columns in your DataFrame containing the file path of each image and its target value. This lets your generator keep regression values and images properly synced even when you shuffle your data at each epoch.
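The syncing idea can be sketched without any library at all. The snippet below is only an illustration, not the generator from the linked post: `batches`, `rows`, and `load_image` are hypothetical names, and `rows` stands in for the (path, target) pairs you would pull out of the DataFrame. The point is that only row indices are shuffled, so a path and its target can never drift apart.

```python
import random


def batches(rows, batch_size, load_image, shuffle=True, seed=None):
    """Yield (images, targets) batches forever, reshuffling once per epoch.

    rows:       list of (path, target) pairs, e.g. taken from the
                'path' and 'target' columns of the DataFrame.
    load_image: hypothetical callable that turns a path into pixel data.
    """
    rng = random.Random(seed)
    order = list(range(len(rows)))
    while True:
        if shuffle:
            # Shuffle row indices only; paths and targets stay paired.
            rng.shuffle(order)
        for start in range(0, len(order), batch_size):
            chunk = [rows[i] for i in order[start:start + batch_size]]
            images = [load_image(path) for path, _ in chunk]
            targets = [float(target) for _, target in chunk]
            yield images, targets


# A few rows copied from the example listing in this answer.
rows = [
    ("/path/to/imgs/756/61/blah_461756.png", 0.166831),
    ("/path/to/imgs/756/61/blah_1161756.png", 0.058793),
    ("/path/to/imgs/651/03/blah_3303651.png", 0.582970),
    ("/path/to/imgs/756/67/blah_3367756.png", -0.421429),
]

# A stub loader that just returns the path makes the pairing easy to inspect.
gen = batches(rows, batch_size=2, load_image=lambda path: path, seed=0)
first_images, first_targets = next(gen)
```

In real use, `load_image` would read and resize the file, and the two lists would be stacked into arrays before being handed to the model.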

Here is an example showing how to link images with binomial targets, multinomial targets and regression targets just to show that "a target is a target is a target" and only the model might change:

    df['path'] = df.object_id.apply(file_path_from_db_id)
    df

           object_id   bi  multi                                    path     target
    index
    0         461756  dog  white    /path/to/imgs/756/61/blah_461756.png   0.166831
    1        1161756  cat  black   /path/to/imgs/756/61/blah_1161756.png   0.058793
    2        3303651  dog  white   /path/to/imgs/651/03/blah_3303651.png   0.582970
    3        3367756  dog   grey   /path/to/imgs/756/67/blah_3367756.png  -0.421429
    4        3767756  dog   grey   /path/to/imgs/756/67/blah_3767756.png  -0.706608
    5        5467756  cat  black   /path/to/imgs/756/67/blah_5467756.png  -0.415115
    6        5561756  dog  white   /path/to/imgs/756/61/blah_5561756.png  -0.631041
    7       31255756  cat   grey  /path/to/imgs/756/55/blah_31255756.png  -0.148226
    8       35903651  cat  black  /path/to/imgs/651/03/blah_35903651.png  -0.785671
    9       44603651  dog  black  /path/to/imgs/651/03/blah_44603651.png  -0.538359
    10      49557622  cat  black  /path/to/imgs/622/57/blah_49557622.png  -0.295279
    11      58164756  dog   grey  /path/to/imgs/756/64/blah_58164756.png   0.407096
    12      95403651  cat  white  /path/to/imgs/651/03/blah_95403651.png   0.790274
    13      95555756  dog   grey  /path/to/imgs/756/55/blah_95555756.png   0.060669
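The body of file_path_from_db_id is not shown here. Judging only from the listing, the directory seems to be built from the last three digits of the ID followed by the two digits before them. A guess at such a helper, with the root, prefix, extension, and zero-padding all being assumptions, might look like this:

```python
def file_path_from_db_id(db_id, root="/path/to/imgs", prefix="blah_", ext=".png"):
    """Guess at the path layout seen in the listing: <root>/<last 3 digits>/<2 digits before>/<file>.

    The real helper is not shown in this answer; IDs shorter than five
    digits are zero-padded here, which is an assumption.
    """
    digits = str(db_id).zfill(5)
    return f"{root}/{digits[-3:]}/{digits[-5:-3]}/{prefix}{db_id}{ext}"


example_path = file_path_from_db_id(461756)
```

Any function that maps an ID to a readable file path will do; the only requirement is that it gives every row its own image.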

I describe how to do this in great detail with examples here:

https://techblog.appnexus.com/a-keras-multithreaded-dataframe-generator-for-millions-of-image-files-84d3027f6f43

answered Sep 23 '22 14:09 by timehaven
