
Combining Keras model.fit's `steps_per_epoch` with TensorFlow's Dataset API's `batch()`

I'm looking at performance and GPU usage while training a CNN model with Keras + TensorFlow. Similar to this question, I'm having a hard time understanding the combined use of Keras `model.fit`'s `steps_per_epoch` and the TensorFlow Dataset API's `.batch()`: I set a certain batch size on the input pipeline with `dataset = dataset.batch(batch_size)` and later I use

fit = model.fit(dataset, epochs=num_epochs, steps_per_epoch=training_set_size//batch_size)

but I see that one can actually set any number of steps per epoch, even more than `training_set_size//batch_size`. From the documentation I understand that in Keras an epoch is not necessarily a pass over the entire training set, as it usually is, but I'm still a bit confused and not entirely sure I'm using it right.

Does `dataset.batch(batch_size)` + `steps_per_epoch=training_set_size//batch_size` define minibatch SGD that runs over the entire training set in minibatches of `batch_size` samples? Are epochs larger than one pass over the training set if steps_per_epoch is set to more than training_set_size//batch_size?
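
For completeness, here is a minimal, self-contained sketch of the kind of setup I mean (the toy data, the tiny model and the `.repeat()` call are placeholders for my actual pipeline):

import numpy as np
import tensorflow as tf

# Toy stand-ins for the real training data: 30 RGB images, 10 classes
features = np.random.rand(30, 32, 32, 3).astype("float32")
labels = np.random.randint(0, 10, size=(30,))

batch_size = 10
training_set_size = features.shape[0]
num_epochs = 5

# Input pipeline: shuffle, repeat indefinitely, then batch
dataset = tf.data.Dataset.from_tensor_slices((features, labels))
dataset = dataset.shuffle(buffer_size=training_set_size)
dataset = dataset.repeat()
dataset = dataset.batch(batch_size)

# A tiny CNN, just enough to make the example run
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

fit = model.fit(dataset, epochs=num_epochs,
                steps_per_epoch=training_set_size // batch_size)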

asked Feb 07 '19 by rsm


People also ask

Which method is used to train a neural network: train(), fit(), add(), or compile()?

As with most machine learning models, artificial neural networks built with the TensorFlow library are trained using the fit method. The fit method takes four parameters, the first two being the x values of the training data and the y values of the training data.
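
As a rough illustration (the data and model below are made up, not from the excerpt above), a typical fit() call on plain NumPy arrays looks like this:

import numpy as np
import tensorflow as tf

# Made-up training data: 100 samples, 4 features, binary labels
x_train = np.random.rand(100, 4).astype("float32")
y_train = np.random.randint(0, 2, size=(100,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Train on the x values and y values of the training data
history = model.fit(x_train, y_train, epochs=10, batch_size=32)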

What is steps_per_epoch in Keras?

steps_per_epoch: Total number of steps (batches of samples) to yield from generator before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of samples of your dataset divided by the batch size.
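
For instance, with made-up numbers:

num_samples = 1000
batch_size = 32
steps_per_epoch = num_samples // batch_size  # 31 full batches per epoch
print(steps_per_epoch)  # the remaining 8 samples only make a partial batch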

Which function of a tf.keras model would you override if you want to write custom training logic?

When you need to customize what fit() does, you should override the training step function of the Model class. This is the function that is called by fit() for every batch of data. You will then be able to call fit() as usual -- and it will be running your own learning algorithm.
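
For reference, a minimal sketch of that pattern with TF 2.x tf.keras (helper names such as compiled_loss have shifted a bit across Keras versions, so treat this as illustrative rather than definitive):

import numpy as np
import tensorflow as tf

class CustomModel(tf.keras.Model):
    # Override the step that fit() runs on every batch of data
    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)           # forward pass
            loss = self.compiled_loss(y, y_pred)      # loss configured in compile()
        # Compute gradients and update the weights
        grads = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        # Update and report the metrics configured in compile()
        self.compiled_metrics.update_state(y, y_pred)
        return {m.name: m.result() for m in self.metrics}

# Build it with the functional API, then call fit() as usual
inputs = tf.keras.Input(shape=(4,))
outputs = tf.keras.layers.Dense(1)(inputs)
model = CustomModel(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
model.fit(np.random.rand(64, 4).astype("float32"),
          np.random.rand(64, 1).astype("float32"), epochs=1)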


1 Answer

steps_per_epoch is the number of batches of your chosen batch size that are run through the network in one epoch.

You have set your steps_per_epoch to training_set_size//batch_size for a good reason: it ensures all the data is trained on in one epoch, provided the training set size divides exactly by the batch size (if not, // rounds down and the leftover samples don't count toward the epoch).

That is to say, if you had a batch size of 10 and a training set size of 30, then steps_per_epoch = 3 ensures all the data is used.
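
A quick sketch of that, assuming a tf.data pipeline like the one in your question:

import tensorflow as tf

training_set_size = 30
batch_size = 10
steps_per_epoch = training_set_size // batch_size  # 3

dataset = tf.data.Dataset.range(training_set_size).batch(batch_size)

# Taking steps_per_epoch batches covers every sample exactly once
for step, batch in enumerate(dataset.take(steps_per_epoch)):
    print("step", step, batch.numpy())
# prints three batches: [0..9], [10..19], [20..29]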

And to quote your question:

"Are epochs larger than one pass over the training set if steps_per_epoch is set to more than training_set_size//batch_size?"

Yes. Some data will be passed through again in the same epoch.
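
One practical caveat on the tf.data side (assuming the pipeline from your question): if steps_per_epoch is larger than training_set_size // batch_size, the dataset has to be able to keep producing batches, which is usually done with .repeat():

import tensorflow as tf

training_set_size = 30
batch_size = 10

dataset = tf.data.Dataset.range(training_set_size)
dataset = dataset.repeat()            # cycle over the data indefinitely
dataset = dataset.batch(batch_size)

# 5 steps > 3 full-pass batches, so early samples come around again
# within the same "epoch"
for step, batch in enumerate(dataset.take(5)):
    print("step", step, batch.numpy())
# steps 3 and 4 repeat samples 0..9 and 10..19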

answered Nov 15 '22 by McGuile