Keras: is there an easy way to mutate (shuffle) data in/out of the training set between epochs?

Without arguing the pros and cons of whether to actually do this, I'm curious whether anyone has created or knows of a simple way to mutate the training data between epochs while fitting a model with Keras.

Example: I have 100 vectors and output features that I'm using to train a model. I randomly pick 80 of them for the training set, setting the other 20 aside for validation, and then run:

model.fit(train_vectors, train_features, validation_data=(test_vectors, test_features))

Keras lets you shuffle the training data with shuffle=True, but this only randomizes the order in which the samples are presented. It might be fun to randomly pick just 40 vectors from the training set, run an epoch, then randomly pick another 40 vectors, run another epoch, etc.

AstroBen asked Jul 31 '18 22:07


2 Answers

https://keras.io/models/model/#fit

model.fit() has an argument steps_per_epoch. If you set shuffle=True and choose steps_per_epoch small enough you will get the behaviour that you describe.

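In your example with 80 training examples: to see 40 of them per epoch you could for instance set batch_size to 20 and steps_per_epoch to 2, or batch_size to 10 and steps_per_epoch to 4, etc. (batch_size times steps_per_epoch is the number of examples used in each epoch).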

sdcbr answered Nov 06 '22 04:11


I found that specifying both steps_per_epoch and batch_size raises an error. You can find the corresponding lines in the code linked below (search for if steps is not None and batch_size is not None:). Thus, we need to implement a data generator to get this behavior.

https://github.com/keras-team/keras/blob/1cf5218edb23e575a827ca4d849f1d52d21b4bb0/keras/engine/training_utils.py
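For what it's worth, here is a minimal sketch of what such a generator could look like. The function names (subset_index_batches, subset_batches) and the seed parameter are my own, not part of Keras, and I am assuming that train_vectors and train_features are NumPy arrays of equal length. Each pass of the outer loop draws a fresh random subset without replacement and yields it in batches, so one "epoch" corresponds to ceil(subset_size / batch_size) consecutive batches.

```python
import random


def subset_index_batches(n_samples, subset_size, batch_size, seed=None):
    """Yield lists of sample indices forever.

    Each pass of the outer loop draws a new random subset (without
    replacement) of `subset_size` indices and yields it in batches.
    """
    rng = random.Random(seed)
    while True:
        subset = rng.sample(range(n_samples), subset_size)
        for start in range(0, subset_size, batch_size):
            yield subset[start:start + batch_size]


def subset_batches(x, y, subset_size, batch_size, seed=None):
    """Yield (inputs, targets) batches; assumes x and y are NumPy arrays."""
    for idx in subset_index_batches(len(x), subset_size, batch_size, seed):
        # Indexing a NumPy array with a list of ints selects those rows.
        yield x[idx], y[idx]


# Hypothetical usage with the arrays from the question (40 of 80 per epoch).
# steps_per_epoch must equal the number of batches per subset so that the
# generator's subsets line up with Keras epochs:
#
# model.fit(subset_batches(train_vectors, train_features, 40, 20),
#           steps_per_epoch=2, epochs=10,
#           validation_data=(test_vectors, test_features))
```

Recent Keras versions accept a Python generator directly in model.fit(); in the older releases this question dates from, the equivalent call would be model.fit_generator(). Since the generator already handles batching, batch_size is not passed to fit at all, which avoids the error described above.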

xtraky answered Nov 06 '22 03:11