
What's the difference between "samples_per_epoch" and "steps_per_epoch" in fit_generator

Tags:

keras

I was confused by this problem for several days...

My question is why the training time differs so massively between setting the batch_size of my generator to 1 and to 20.

If I set the batch_size to 1, one epoch trains in approximately 180 ~ 200 sec. If I set the batch_size to 20, one epoch takes approximately 3000 ~ 3200 sec.

This huge difference seems abnormal; I expected the reverse: batch_size = 1 -> 3000 ~ 3200 sec, batch_size = 20 -> 180 ~ 200 sec.

The input to my generator is not file paths but numpy arrays that are already loaded into memory via np.load(), so I don't think an I/O bottleneck is the issue.

I'm using Keras 2.0.3 and my backend is tensorflow-gpu 1.0.1.

I have seen the update in this merged PR, but it seems that change doesn't affect anything here (the usage is the same as before).

Here is the gist of my self-defined generator and the relevant part of my fit_generator call.
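The linked gist isn't reproduced here, but a minimal sketch of the kind of generator described above might look like the following (the function name and toy arrays are hypothetical; the real arrays would come from np.load()):

```python
import numpy as np

def batch_generator(x, y, batch_size):
    """Yield (inputs, targets) batches from in-memory numpy arrays, forever."""
    n = len(x)
    while True:  # Keras expects the generator to loop indefinitely
        for start in range(0, n, batch_size):
            end = min(start + batch_size, n)
            yield x[start:end], y[start:end]

# Toy data standing in for arrays loaded via np.load()
x = np.zeros((100, 4))
y = np.zeros((100,))

gen = batch_generator(x, y, batch_size=20)
xb, yb = next(gen)
print(xb.shape, yb.shape)  # (20, 4) (20,)
```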

asked Apr 17 '17 by HappyStorm

People also ask

What is the purpose of the Steps_per_epoch function in the Fit_generator?

steps_per_epoch: it specifies the total number of steps taken before one epoch is declared finished and the next epoch starts. By default its value is set to None.

What does Steps_per_epoch mean?

steps_per_epoch: Total number of steps (batches of samples) to yield from generator before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of unique samples of your dataset divided by the batch size.

How to define steps_ per_ epoch?

steps_per_epoch is the number of batches of samples to train on in one epoch. It defines when one epoch is declared finished and the next one starts. If you train on a fixed-size dataset (rather than a generator), you can ignore it.

What is Nb_val_samples?

nb_val_samples determines how many validation samples your model is evaluated on after finishing each epoch. It is up to you as well. The usual thing is to set nb_val_samples = validation_generator.nb_samples in order to evaluate your model on the full validation set.
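A quick sketch of the usual settings described above (the sizes and names here are hypothetical placeholders, not values from the question):

```python
# Hypothetical dataset sizes and batch size
n_train, n_val, batch_size = 8000, 2000, 32

# steps_per_epoch: batches yielded per epoch = dataset size / batch size
steps_per_epoch = n_train // batch_size

# nb_val_samples: evaluate on the full validation set after each epoch
nb_val_samples = n_val

print(steps_per_epoch, nb_val_samples)  # 250 2000
```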


2 Answers

When you use fit_generator, the number of samples processed for each epoch is batch_size * steps_per_epochs. From the Keras documentation for fit_generator: https://keras.io/models/sequential/

steps_per_epoch: Total number of steps (batches of samples) to yield from generator before declaring one epoch finished and starting the next epoch. It should typically be equal to the number of unique samples of your dataset divided by the batch size.

This is different from the behaviour of fit, where increasing batch_size typically speeds things up.

In conclusion, when you increase batch_size with fit_generator, you should decrease steps_per_epoch by the same factor if you want training time to stay the same or drop.
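The arithmetic behind this conclusion can be checked directly. With a fixed steps_per_epoch, going from batch_size 1 to 20 multiplies the samples processed per epoch by 20, which matches the ~16x slowdown in the question; scaling steps_per_epoch down by the same factor keeps the work per epoch constant (the sample count here is illustrative):

```python
n_samples = 4000  # illustrative dataset size

# Two configurations that process the same number of samples per epoch:
for batch_size in (1, 20):
    steps_per_epoch = n_samples // batch_size
    samples_per_epoch = batch_size * steps_per_epoch
    print(batch_size, steps_per_epoch, samples_per_epoch)
# 1 4000 4000
# 20 200 4000

# By contrast, keeping steps_per_epoch fixed at 4000 while raising
# batch_size to 20 would process 20 * 4000 = 80000 samples per epoch.
```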

answered Sep 18 '22 by pgrenholm


Let's clear it up:

Assume you have a dataset with 8000 samples (rows of data) and you choose a batch_size = 32 and epochs = 25

This means that the dataset will be divided into (8000/32) = 250 batches, having 32 samples/rows in each batch. The model weights will be updated after each batch.

One epoch will train on 250 batches, i.e. 250 updates to the model.

Here, steps_per_epoch = number of batches.

With 25 epochs, the model will pass through the whole dataset 25 times.
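The numbers above can be verified with a quick calculation:

```python
dataset_size, batch_size, epochs = 8000, 32, 25

batches_per_epoch = dataset_size // batch_size  # this is steps_per_epoch
weight_updates = batches_per_epoch * epochs     # total updates over training

print(batches_per_epoch, weight_updates)  # 250 6250
```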

Ref - https://machinelearningmastery.com/difference-between-a-batch-and-an-epoch/


answered Sep 17 '22 by AbtabM