I'm a little bit confused about initial_epoch
value in fit
and fit_generator
methods. Here is the doc:
initial_epoch: Integer. Epoch at which to start training (useful for resuming a previous training run).
I understand, it is not useful if you start training from scratch. It is useful if you trained your dataset and want to improve accuracy or other values (correct me if I'm wrong). But I'm not sure what it really does.
So after all this, I have 2 questions:
initial_epoch
do and what is it for?initial_epoch
?The right number of epochs depends on the inherent perplexity (or complexity) of your dataset. A good rule of thumb is to start with a value that is 3 times the number of columns in your data. If you find that the model is still improving after all epochs complete, try again with a higher value.
fit() is for training the model with the given inputs (and corresponding training labels). evaluate() is for evaluating the already trained model using the validation (or test) data and the corresponding labels. Returns the loss value and metrics values for the model.
The steps per epoch simply indicate how many times the batch of the dataset has been fed to the network in each epoch.
number of threads generating batches in parallel. Batches are computed in parallel on the CPU and passed on the fly onto the GPU for neural network computations.
Since in some of the optimizers, some of their internal values (e.g. learning rate) are set using the current epoch
value, or even you may have (custom) callbacks that depend on the current value of epoch
, the initial_epoch
argument let you specify the initial value of epoch
to start from when training.
As stated in the documentation, this is mostly useful when you have trained your model for some epochs, say 10, and then saved it and now you want to load it and resume the training for another 10 epochs without disrupting the state of epoch-dependent objects (e.g. optimizer). So you would set initial_epoch=10
(i.e. we have trained the model for 10 epochs) and epochs=20
(not 10, since the total number of epochs to reach is 20) and then everything resume as if you were initially trained the model for 20 epochs in one single training session.
However, note that when using built-in optimizers of Keras you don't need to use initial_epoch
, since they store and update their state internally (without considering the value of current epoch) and also when saving a model the state of the optimizer will be stored as well.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With