I want to use the flow_from_directory method of the ImageDataGenerator to generate training data for a regression model, where the target value can be any float between -1 and 1.
flow_from_directory has a "class_mode" parameter with the description:
class_mode: one of "categorical", "binary", "sparse" or None. Default: "categorical". Determines the type of label arrays that are returned: "categorical" will be 2D one-hot encoded labels, "binary" will be 1D binary labels, "sparse" will be 1D integer labels.
Which of these values should I take? None of them seems to really fit...
Keras ImageDataGenerator lets you augment your images in real time while your model is training. You can apply random transformations to each training image as it is passed to the model. This not only makes your model more robust but also saves memory, because the augmented images are generated on the fly instead of being stored.
A Keras Sequential model can be used to train a regression network. It can have one or more hidden layers, each with one or more nodes and an associated activation function. The final layer needs just one node and no activation function, because the prediction must be a continuous numerical value.
Keras ImageDataGenerator with flow(): Keras' ImageDataGenerator class lets you perform image augmentation while training the model. If you do not have sufficient knowledge about data augmentation, please refer to this tutorial, which explains the various transformation methods with examples.
With Keras 2.2.4 you can use "flow_from_dataframe", which does what you want: it lets you flow images from a directory for regression problems. Store all your images in one folder, load a dataframe that holds the image IDs in one column and the regression scores (labels) in another, and set "class_mode='other'" in "flow_from_dataframe".
Here is an example where the images are in "image_dir" and the dataframe with the image IDs and the regression scores is loaded with pandas from "train_file":
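    import pandas as pd
    from keras.preprocessing.image import ImageDataGenerator

    # train_file, image_dir, img_width, img_height and bs are defined elsewhere.
    train_label_df = pd.read_csv(train_file, delimiter=' ', header=None,
                                 names=['id', 'score'])

    train_datagen = ImageDataGenerator(rescale=1./255, horizontal_flip=True,
                                       fill_mode="nearest", zoom_range=0.2,
                                       width_shift_range=0.2, height_shift_range=0.2,
                                       rotation_range=30)

    # has_ext and class_mode="other" are the Keras 2.2.4 spelling; later releases
    # deprecate has_ext and replace class_mode="other" with class_mode="raw".
    train_generator = train_datagen.flow_from_dataframe(
        dataframe=train_label_df, directory=image_dir,
        x_col="id", y_col="score", has_ext=True, class_mode="other",
        target_size=(img_width, img_height), batch_size=bs)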
I think that organizing your data differently, using a DataFrame (without necessarily moving your images to new locations), will allow you to run a regression model. In short, create columns in your DataFrame containing the file path of each image and its target value. This allows your generator to keep regression values and images properly synced even when you shuffle your data at each epoch.
Here is an example showing how to link images with binomial targets, multinomial targets and regression targets just to show that "a target is a target is a target" and only the model might change:
    df['path'] = df.object_id.apply(file_path_from_db_id)
    df

    index  object_id  bi   multi  path                                       target
    0         461756  dog  white  /path/to/imgs/756/61/blah_461756.png     0.166831
    1        1161756  cat  black  /path/to/imgs/756/61/blah_1161756.png    0.058793
    2        3303651  dog  white  /path/to/imgs/651/03/blah_3303651.png    0.582970
    3        3367756  dog  grey   /path/to/imgs/756/67/blah_3367756.png   -0.421429
    4        3767756  dog  grey   /path/to/imgs/756/67/blah_3767756.png   -0.706608
    5        5467756  cat  black  /path/to/imgs/756/67/blah_5467756.png   -0.415115
    6        5561756  dog  white  /path/to/imgs/756/61/blah_5561756.png   -0.631041
    7       31255756  cat  grey   /path/to/imgs/756/55/blah_31255756.png  -0.148226
    8       35903651  cat  black  /path/to/imgs/651/03/blah_35903651.png  -0.785671
    9       44603651  dog  black  /path/to/imgs/651/03/blah_44603651.png  -0.538359
    10      49557622  cat  black  /path/to/imgs/622/57/blah_49557622.png  -0.295279
    11      58164756  dog  grey   /path/to/imgs/756/64/blah_58164756.png   0.407096
    12      95403651  cat  white  /path/to/imgs/651/03/blah_95403651.png   0.790274
    13      95555756  dog  grey   /path/to/imgs/756/55/blah_95555756.png   0.060669
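To make the syncing point concrete, here is a minimal pure-Python sketch. It is not the generator from the post linked below. The function name paired_batches and its seed parameter are made up for illustration, and the rows are copied from the first entries of the data shown above. The only idea it demonstrates is that you shuffle one list of row indices and read both columns through it, so a path can never drift away from its target.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def paired_batches(
    paths: Sequence[str],
    targets: Sequence[float],
    batch_size: int,
    seed: int = 0,
) -> Iterator[Tuple[List[str], List[float]]]:
    """Yield one epoch of (paths, targets) batches in shuffled order.

    A single list of row indices is shuffled and used to read both
    columns, which keeps each image path attached to its own target.
    """
    if len(paths) != len(targets):
        raise ValueError("paths and targets must have the same length")
    order = list(range(len(paths)))
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        rows = order[start:start + batch_size]
        yield [paths[i] for i in rows], [targets[i] for i in rows]


# Rows copied from the first entries of the data shown above.
paths = [
    "/path/to/imgs/756/61/blah_461756.png",
    "/path/to/imgs/756/61/blah_1161756.png",
    "/path/to/imgs/651/03/blah_3303651.png",
    "/path/to/imgs/756/67/blah_3367756.png",
    "/path/to/imgs/756/67/blah_3767756.png",
]
targets = [0.166831, 0.058793, 0.582970, -0.421429, -0.706608]

lookup = dict(zip(paths, targets))
for epoch in range(2):
    # A different seed per epoch gives a different order each time.
    for batch_paths, batch_targets in paired_batches(paths, targets, 2, seed=epoch):
        # Every path in the batch still carries its original target.
        assert all(lookup[p] == t for p, t in zip(batch_paths, batch_targets))
```

A real generator would also load and preprocess each image from its path before yielding the batch; that part is left out here to keep the sketch short.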
I describe how to do this in great detail with examples here:
https://techblog.appnexus.com/a-keras-multithreaded-dataframe-generator-for-millions-of-image-files-84d3027f6f43