
How to store best models checkpoints, not only newest 5, in Tensorflow Object Detection API?

I'm training MobileNet on the WIDER FACE dataset and have run into a problem I can't solve. The TF Object Detection API keeps only the last 5 checkpoints in the train dir, but I would like to save the best models according to the mAP metric (or at least keep many more checkpoints in the train dir before they are deleted). For example, today I looked at TensorBoard after another night of training and saw that the model had overfitted overnight. I can't restore the best checkpoint, because it has already been deleted.

EDIT: I'm just using the TensorFlow Object Detection API, which by default keeps the last 5 checkpoints in the train dir I point it to. I'm looking for a configuration parameter, or anything else, that will change this behavior.

Does anyone have a fix in code, a config parameter to set, or a workaround for this? It seems like I'm missing something: surely what matters is the best model, not the newest one (which may have overfitted).

Thanks!

Piotr Januszewski asked Feb 05 '23 00:02


1 Answer

You can modify the arguments passed to tf.train.Saver (either by hardcoding them in your fork, or by opening a pull request that adds the options to the protos) in:

https://github.com/tensorflow/models/blob/master/research/object_detection/legacy/trainer.py#L376-L377

You will probably want to set:

  • max_to_keep: Maximum number of recent checkpoints to keep. Defaults to 5.
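  • keep_checkpoint_every_n_hours: How often to keep a checkpoint permanently. In addition to the max_to_keep most recent checkpoints, one checkpoint is kept for every N hours of training. Defaults to 10,000 hours.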
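
If changing the Saver arguments is not practical, one possible workaround is to copy a checkpoint out of the train dir whenever evaluation reports a better mAP, so the Saver's cleanup can no longer delete it. The sketch below is only an illustration and is not part of the Object Detection API: the function name backup_if_best, its parameters, and the example paths are all hypothetical, and it assumes you obtain the mAP value yourself (for example from your evaluation job) and that checkpoint files share a common prefix such as model.ckpt-1000.

```python
import glob
import os
import shutil


def backup_if_best(checkpoint_prefix, metric, backup_dir, best_so_far=None):
    """Copy one checkpoint's files to backup_dir if `metric` beats best_so_far.

    Hypothetical helper, not part of TensorFlow. Returns the best metric
    value seen so far (updated when this checkpoint is an improvement).
    """
    if best_so_far is not None and metric <= best_so_far:
        return best_so_far  # not an improvement, nothing to copy

    os.makedirs(backup_dir, exist_ok=True)
    # A checkpoint is several files sharing one prefix, e.g.
    # model.ckpt-1000.index, model.ckpt-1000.meta, model.ckpt-1000.data-*.
    # The trailing "." keeps model.ckpt-100 from matching model.ckpt-1000.
    for path in glob.glob(glob.escape(checkpoint_prefix) + ".*"):
        shutil.copy2(path, backup_dir)
    return metric
```

A call might look like best = backup_if_best("train_dir/model.ckpt-1000", 0.61, "best_ckpts", best), run once per evaluation. Note that this keeps every checkpoint that was the best at the time it was evaluated, so older backups may need pruning by hand.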
David de la Iglesia answered Feb 06 '23 15:02