When using tf.train.MonitoredTrainingSession, is it possible to save all the checkpoints?
It has a parameter (save_checkpoint_secs=600) to specify how often, in seconds, a checkpoint is saved, but there is no option to specify how many checkpoints are kept.
With a plain tf.train.Saver(), by contrast, there is an option to specify max_to_keep.
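You can pass a tf.train.Saver to a tf.train.MonitoredTrainingSession through a tf.train.Scaffold. The session then uses that Saver, including its max_to_keep setting; passing max_to_keep=None (or 0) keeps every checkpoint instead of deleting older ones: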
import tensorflow as tf

# Wrap a custom Saver in a Scaffold; max_to_keep sets how many recent checkpoints are retained.
scaffold = tf.train.Scaffold(saver=tf.train.Saver(max_to_keep=10))
with tf.train.MonitoredTrainingSession(scaffold=scaffold) as sess:
    ...
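To keep every checkpoint rather than the last 10, a fuller sketch might look like the following. It makes several assumptions that are not in the original answer: it goes through tf.compat.v1 so it can also run on TensorFlow 2.x, it writes to a temporary directory, it uses a counter increment as a stand-in for a real training op, and it saves every step (save_checkpoint_steps=1) only to make the effect easy to see. Note that MonitoredTrainingSession only writes checkpoints when checkpoint_dir is set.

```python
import glob
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TF 2.x installed; on TF 1.x, tf.train can be used directly

checkpoint_dir = tempfile.mkdtemp()  # hypothetical location, for illustration only

with tf.Graph().as_default():
    global_step = tf1.train.get_or_create_global_step()
    # Stand-in for a real training op: just advance the global step.
    train_op = tf1.assign_add(global_step, 1)

    # max_to_keep=None (or 0) tells the Saver not to delete older checkpoints.
    scaffold = tf1.train.Scaffold(saver=tf1.train.Saver(max_to_keep=None))

    with tf1.train.MonitoredTrainingSession(
            checkpoint_dir=checkpoint_dir,   # required for checkpoints to be written
            scaffold=scaffold,
            save_checkpoint_steps=1,         # illustrative; save_checkpoint_secs also works
            save_summaries_steps=None,       # disable summaries to keep the sketch small
            save_summaries_secs=None) as sess:
        for _ in range(6):
            sess.run(train_op)

# Each checkpoint written by the default V2 format has a matching .index file.
saved = sorted(glob.glob(os.path.join(checkpoint_dir, "model.ckpt-*.index")))
print(len(saved), "checkpoints kept")
```

With the Saver's default of max_to_keep=5, older files in this run would be removed as new ones are written; with None they all remain on disk, so disk usage grows with every save.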