Is it possible to mark checkpoints not to be deleted?
A little context:
I am creating a reinforcement learning model and I want to save my best model throughout training. To do that, I keep track of the best score and save a checkpoint whenever it improves.
Unfortunately, my best_score checkpoints are getting deleted. I understand that the reason is that TF only keeps the newest 5 checkpoints, and this is fine.
I just want to keep the 5 most recent checkpoints AND the best checkpoint, which might not be among the most recent five. Is there a way to do this without storing all the checkpoints?
Thank you all!
Looking at the issues posted here and here, this appears to be a requested feature that has not yet been implemented. You can prevent all checkpoints from being deleted by creating the saver with saver = tf.train.Saver(max_to_keep=0). If you're doing something big, then to keep this from filling up your disk I'd recommend two things: don't start saving checkpoints until a reasonable number of steps have passed, and don't save unless the current result beats the last saved result by some minimum amount.
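To make the two recommendations concrete, here is a minimal sketch of the save decision in plain Python. The class name BestCheckpointGate and the parameters warmup_steps and min_delta are hypothetical names made up for this illustration, not part of TensorFlow, and the sketch assumes a higher score is better.

```python
class BestCheckpointGate:
    """Decides whether a new best-score checkpoint is worth writing.

    Hypothetical helper for illustration only; assumes higher scores are better.
    """

    def __init__(self, warmup_steps=1000, min_delta=0.01):
        self.warmup_steps = warmup_steps  # no saving before this step
        self.min_delta = min_delta        # required improvement over the last saved score
        self.best_saved = None            # score of the last checkpoint we agreed to save

    def should_save(self, step, score):
        # Skip the early part of training, where scores are usually noisy.
        if step < self.warmup_steps:
            return False
        # Skip results that do not beat the last saved result by at least min_delta.
        if self.best_saved is not None and score < self.best_saved + self.min_delta:
            return False
        self.best_saved = score
        return True


# Sketch of use inside a training loop (sess, saver and path come from your own code):
#     if gate.should_save(step, score):
#         saver.save(sess, path, global_step=step)
```

With max_to_keep=0 every checkpoint that passes this gate stays on disk, so the two thresholds are what bound the disk usage; reasonable values depend on how noisy your scores are.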