I'm looking at the performance and GPU usage during training of a CNN model with Keras+TensorFlow. Similar to this question, I'm having a hard time understanding the combined use of Keras model.fit's steps_per_epoch and the TensorFlow Dataset API's .batch(): I set a certain batch size on the input pipeline with dataset = dataset.batch(batch_size) and later I use

fit = model.fit(dataset, epochs=num_epochs, steps_per_epoch=training_set_size//batch_size)

but I see that one can actually set any number of steps per epoch, even more than training_set_size//batch_size. From the documentation I understand that in Keras an epoch is not necessarily a full pass over the training set, as it usually is, but I'm still a bit confused and not entirely sure I'm using this correctly.
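Roughly, my setup looks like the following sketch. The dataset, model, and sizes here are simplified placeholders, not my real pipeline, and I'm assuming .repeat() on the dataset so that any steps_per_epoch value can be served:

```python
import tensorflow as tf

# Placeholder sizes, not my real ones
training_set_size = 60000
batch_size = 32
num_epochs = 5

# Toy stand-in for my real input pipeline (MNIST images)
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype("float32") / 255.0

dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
# .repeat() assumed so the pipeline can serve any number of steps
dataset = dataset.shuffle(10000).batch(batch_size).repeat()

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

fit = model.fit(dataset, epochs=num_epochs,
                steps_per_epoch=training_set_size // batch_size)
```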
Is dataset.batch(batch_size) + steps_per_epoch=training_set_size//batch_size defining a minibatch SGD that runs over the entire training set in minibatches of batch_size samples? Are epochs larger than one pass over the training set if steps_per_epoch is set to more than training_set_size//batch_size?
As with most machine learning models, artificial neural networks built with the TensorFlow library are trained using the fit method. Among other parameters, fit takes the x values of the training data, the y values of the training data (or a dataset that yields both together), the number of epochs, and steps_per_epoch.
steps_per_epoch: Total number of steps (batches of samples) to yield from generator before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of samples of your dataset divided by the batch size.
When you need to customize what fit() does, you should override the training step function of the Model class, train_step. This is the function that fit() calls for every batch of data. You will then be able to call fit() as usual, and it will run your own learning algorithm.
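A minimal sketch of that pattern, following the TF 2.x Keras guide on customizing fit() (the (x, y) structure of the batch is an assumption; newer Keras versions expose slightly different helper names):

```python
import tensorflow as tf

class CustomModel(tf.keras.Model):
    def train_step(self, data):
        # fit() calls this once per batch; `data` is whatever the
        # dataset yields, assumed here to be an (x, y) tuple.
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            loss = self.compiled_loss(y, y_pred)
        grads = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        self.compiled_metrics.update_state(y, y_pred)
        return {m.name: m.result() for m in self.metrics}
```

After compiling a CustomModel as usual, model.fit(dataset, ...) runs this step for every batch it draws.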
steps_per_epoch is the number of batches of your chosen batch size that are run through the network in one epoch.
You have set your steps_per_epoch to training_set_size//batch_size for a good reason: it ensures that all of the data is trained on in one epoch, provided the size divides exactly (if not, the // operator rounds down and the leftover samples do not fit into any step of that epoch). That is to say, if you had a batch size of 10 and a training set size of 30, then steps_per_epoch = 3 ensures all data are used.
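To make the rounding concrete, a quick check with made-up sizes that do not divide evenly:

```python
training_set_size = 35   # hypothetical size that does not divide evenly
batch_size = 10

steps_per_epoch = training_set_size // batch_size
print(steps_per_epoch)                  # 3 -> only 30 samples covered by full steps
print(training_set_size % batch_size)   # 5 samples fall outside the 3 full batches
```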
And to quote your question:
"Are epochs larger than one pass over the training set if steps_per_epoch is set to more than training_set_size//batch_size?"
Yes, assuming the input pipeline can keep serving batches (for example via dataset.repeat()): some data will then be passed through more than once in the same epoch. If instead the dataset is finite and gets exhausted, Keras interrupts training and warns that the input ran out of data.
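Here is a small sketch of that behaviour with made-up sizes, assuming a repeating pipeline:

```python
import tensorflow as tf

samples = tf.data.Dataset.range(30)    # 30 "samples"
dataset = samples.batch(10).repeat()   # repeat so extra steps can be served

# 5 steps per "epoch" > 30 // 10 = 3 batches per pass, so the
# first two batches are revisited within the same epoch.
for step, batch in enumerate(dataset.take(5)):
    print(step, batch.numpy())
```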