
Save model every 10 epochs tensorflow.keras v2

I'm using Keras as a submodule of TensorFlow v2 and I'm training my model using the fit_generator() method. I want to save my model every 10 epochs. How can I achieve this?

In standalone Keras (not as a submodule of tf), I can pass ModelCheckpoint(model_savepath, period=10). But in tf v2, this has changed to ModelCheckpoint(model_savepath, save_freq), where save_freq can be 'epoch', in which case the model is saved every epoch. If save_freq is an integer, the model is saved after that many samples have been processed. But I want it to be after 10 epochs. How can I achieve this?

asked Nov 27 '19 by Nagabhushan S N

People also ask

How do you save keras model after every epoch?

Let's say, for example, after epoch = 150 is over, it will be saved as model.save(model_1.h5), and after epoch = 152, it will be saved as model.save(model_2.h5), and so on.

How do I save model weights for each epoch?

To save weights every epoch, you can use callbacks in Keras: create checkpoint = ModelCheckpoint(.....) and set the period argument to 1, which defines the save periodicity in epochs. This should do it.
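For illustration, here is a minimal runnable sketch of that callback approach; the toy model, the data, and the weights.{epoch:02d}.h5 path are made up, and in current tf.keras the per-epoch behaviour is requested with save_freq='epoch' rather than period=1.

import numpy as np
import tensorflow as tf

# Toy model and data, only to make the sketch runnable.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")
x, y = np.random.rand(32, 4), np.random.rand(32, 1)

# Save the weights at the end of every epoch (the old period=1 behaviour).
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath="weights.{epoch:02d}.h5",
    save_weights_only=True,
    save_freq="epoch",
    verbose=1)

model.fit(x, y, epochs=3, callbacks=[checkpoint])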

How do you save a keras sequential model?

There are two formats you can use to save an entire model to disk: the TensorFlow SavedModel format and the older Keras H5 format. The recommended format is SavedModel. It is the default when you use model.save().
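As a rough sketch of the two formats (the toy model and the my_model / my_model.h5 paths are placeholders):

import numpy as np
import tensorflow as tf

# Toy model, only to make the sketch runnable.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")
model.fit(np.random.rand(16, 4), np.random.rand(16, 1), epochs=1, verbose=0)

# TensorFlow SavedModel format (the default): a path with no extension
# creates a directory containing the SavedModel.
model.save("my_model")

# Older Keras H5 format: an .h5 extension selects it.
model.save("my_model.h5")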

How do I save a model in TensorFlow keras?

The save_weights() method saves only the weights of the layers contained in the model. When saving a model with TensorFlow, it is advised to use the save() method rather than save_weights(); however, H5 weight files can also be written with save_weights().
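A small sketch of the difference; the build_model() helper and the file names are made up for illustration.

import numpy as np
import tensorflow as tf

# Helper so the same architecture can be rebuilt before loading weights.
def build_model():
    m = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    m.compile(optimizer="adam", loss="mse")
    return m

model = build_model()
model.fit(np.random.rand(16, 4), np.random.rand(16, 1), epochs=1, verbose=0)

# save_weights() stores only the layer weights; restoring them requires
# rebuilding the same architecture in code first.
model.save_weights("weights.h5")
restored = build_model()
restored.load_weights("weights.h5")

# save() stores architecture, weights and optimizer state in one artifact.
model.save("full_model.h5")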

What is the use of saved model in keras?

SavedModel is the more comprehensive save format: it stores the model architecture, the weights, and the traced TensorFlow subgraphs of the call functions. This enables Keras to restore both built-in layers and custom objects. Calling save('my_model') on a trained model creates a SavedModel folder named my_model.
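For a rough illustration of what that folder looks like (the toy model and the my_model path are placeholders; the exact folder contents vary slightly between TF versions):

import os
import numpy as np
import tensorflow as tf

# Toy model, only to make the sketch runnable.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")
model.fit(np.random.rand(16, 4), np.random.rand(16, 1), epochs=1, verbose=0)

# save() with a plain path creates a SavedModel folder holding the
# architecture, weights and traced call graphs.
model.save("my_model")
print(os.listdir("my_model"))  # typically saved_model.pb, variables/, assets/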

Is it possible to load a TensorFlow graph in keras?

The function name is sufficient for loading as long as it is registered as a custom object. It's also possible to load the TensorFlow graph generated by Keras; if you do so, you won't need to provide any custom_objects.
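A hedged sketch of the registration idea, assuming a TF version that provides tf.keras.utils.register_keras_serializable; the custom_activation function and the "demo" package name are made up for illustration.

import numpy as np
import tensorflow as tf

# Registering the custom function means its name is enough for loading,
# so custom_objects does not have to be passed to load_model().
@tf.keras.utils.register_keras_serializable(package="demo")
def custom_activation(x):
    return tf.nn.relu(x) + 0.1 * x

model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation=custom_activation, input_shape=(4,)),
    tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse")
model.fit(np.random.rand(16, 4), np.random.rand(16, 1), epochs=1, verbose=0)

model.save("model_with_custom_fn")  # SavedModel folder
restored = tf.keras.models.load_model("model_with_custom_fn")  # no custom_objects needed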

How do I save a model in TensorFlow?

Use model.save() or tf.keras.models.save_model() to save, and tf.keras.models.load_model() to load. There are two formats you can use to save an entire model to disk: the TensorFlow SavedModel format and the older Keras H5 format. The recommended format is SavedModel; it is the default when you use model.save().
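A minimal round-trip sketch (the toy model and the saved_model_dir path are placeholders):

import numpy as np
import tensorflow as tf

# Toy model, only to make the sketch runnable.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")
model.fit(np.random.rand(16, 4), np.random.rand(16, 1), epochs=1, verbose=0)

# Saves the whole model; without an .h5 extension the default
# SavedModel format is used.
tf.keras.models.save_model(model, "saved_model_dir")

# Load it back: architecture, weights and optimizer state included.
restored = tf.keras.models.load_model("saved_model_dir")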

Is it possible to save model after specific epochs?

Also, saving every N epochs is not an option for me. What I am trying to do is save the model after some specific epochs are done. Let's say, for example, after epoch = 150 is over, it will be saved as model.save(model_1.h5), and after epoch = 152, it will be saved as model.save(model_2.h5), etc., for a few specific epochs.
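One way to do this is sketched below with tf.keras.callbacks.LambdaCallback; the SAVE_AT epoch numbers, the toy model and the file names are made up for illustration.

import numpy as np
import tensorflow as tf

# Toy model, only to make the sketch runnable.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

SAVE_AT = {150, 152}  # hypothetical epoch numbers to save at

# Save the full model only when the 1-based epoch number is in SAVE_AT
# (epoch is 0-based inside the callback, hence the +1).
saver = tf.keras.callbacks.LambdaCallback(
    on_epoch_end=lambda epoch, logs:
        model.save(f"model_epoch_{epoch + 1}.h5") if (epoch + 1) in SAVE_AT else None)

model.fit(np.random.rand(16, 4), np.random.rand(16, 1),
          epochs=200, callbacks=[saver], verbose=0)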


3 Answers

With tf.keras.callbacks.ModelCheckpoint, use save_freq='epoch' and pass the extra argument period=10.

Although this behaviour isn't explained in the official docs, that is the way to do it (the docs note that you can pass period, but they don't explain what it does).
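A minimal sketch of that suggestion (the toy model, data and checkpoint path are placeholders; period was removed in later TF releases, so this only works on versions that still accept it):

import numpy as np
import tensorflow as tf

# Toy model and data, only to make the sketch runnable.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

# save_freq='epoch' plus the (barely documented) period argument.
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath="ckpt_epoch_{epoch:02d}.h5",
    save_freq="epoch",
    period=10,
    verbose=1)

model.fit(np.random.rand(64, 4), np.random.rand(64, 1),
          epochs=50, callbacks=[cp_callback], verbose=0)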

answered Oct 21 '22 by bluesummers


Explicitly computing the number of batches per epoch worked for me.

import tensorflow as tf

BATCH_SIZE = 20
# Integer division so steps_per_epoch is a whole number of batches
STEPS_PER_EPOCH = train_labels.size // BATCH_SIZE
SAVE_PERIOD = 10

# Create a callback that saves the model's weights every 10 epochs
# (save_freq is expressed in batches: batches per epoch * epochs)
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    verbose=1,
    save_weights_only=True,
    save_freq=int(SAVE_PERIOD * STEPS_PER_EPOCH))

# Train the model with the new callback
model.fit(train_images, 
          train_labels,
          batch_size=BATCH_SIZE,
          steps_per_epoch=STEPS_PER_EPOCH,
          epochs=50, 
          callbacks=[cp_callback],
          validation_data=(test_images,test_labels),
          verbose=0)
answered Oct 21 '22 by Antonio Sánchez


The period param mentioned in the accepted answer is no longer available.

Using the save_freq param is an alternative, but risky: as mentioned in the docs, it may become unstable if the dataset size changes, and "Note that if the saving isn't aligned to epochs, the monitored metric may potentially be less reliable" (again taken from the docs).

Thus, I use a subclass as a solution:

import tensorflow as tf


class EpochModelCheckpoint(tf.keras.callbacks.ModelCheckpoint):
    """ModelCheckpoint variant that saves every `frequency` epochs."""

    def __init__(self,
                 filepath,
                 frequency=1,
                 monitor='val_loss',
                 verbose=0,
                 save_best_only=False,
                 save_weights_only=False,
                 mode='auto',
                 options=None,
                 **kwargs):
        super(EpochModelCheckpoint, self).__init__(filepath, monitor, verbose, save_best_only, save_weights_only,
                                                   mode, "epoch", options)
        self.epochs_since_last_save = 0
        self.frequency = frequency

    def on_epoch_end(self, epoch, logs=None):
        self.epochs_since_last_save += 1
        # pylint: disable=protected-access
        if self.epochs_since_last_save % self.frequency == 0:
            self._save_model(epoch=epoch, batch=None, logs=logs)

    def on_train_batch_end(self, batch, logs=None):
        # Override the batch-level hook so nothing is saved mid-epoch.
        pass

Use it as:

callbacks=[
     EpochModelCheckpoint("/your_save_location/epoch{epoch:02d}", frequency=10),
]

Note that, depending on your TF version, you may have to change the arguments in the call to the superclass __init__.

answered Oct 21 '22 by miwe