How to restore a model by filename in TensorFlow r12?



I have run the distributed MNIST example: https://github.com/tensorflow/tensorflow/blob/r0.12/tensorflow/tools/dist_test/python/mnist_replica.py

I have set the saver to keep every checkpoint:

saver = tf.train.Saver(max_to_keep=0)

In previous releases, such as r11, I was able to loop over every checkpoint and evaluate the precision of the model at each one. This gave me a plot of precision versus global step (iteration).

Prior to r12, TensorFlow checkpoints were saved as two files, model.ckpt-1234 and model.ckpt-1234.meta. One could restore a model by passing the model.ckpt-1234 filename, like so: saver.restore(sess, 'model.ckpt-1234').

However, I've noticed that in r12 there are now three output files: model.ckpt-1234.data-00000-of-00001, model.ckpt-1234.index, and model.ckpt-1234.meta.

I see that the restore documentation says that a path such as /train/path/model.ckpt should be given to restore instead of a filename. Is there any way to load one checkpoint at a time to evaluate it? I have tried passing the model.ckpt-1234.data-00000-of-00001, model.ckpt-1234.index, and model.ckpt-1234.meta files, but I get errors like the ones below:

W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open logdir/2016-12-08-13-54/model.ckpt-0.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

NotFoundError (see above for traceback): Tensor name "hid_b" not found in checkpoint files logdir/2016-12-08-13-54/model.ckpt-0.index [[Node: save/RestoreV2_1 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_1/tensor_names, save/RestoreV2_1/shape_and_slices)]]

W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open logdir/2016-12-08-13-54/model.ckpt-0.meta: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

I'm running on macOS Sierra with TensorFlow r12 installed via pip.

Any guidance would be helpful.

Thank you.

2 Answers

You can restore the model like this:

saver = tf.train.import_meta_graph('./src/models/20170512-110547/model-20170512-110547.meta')

where the directory './src/models/20170512-110547/' contains three files.


If one directory holds more than one checkpoint, for example the two checkpoints in ./20170807-231648/, pass the .meta file of the one you want:

saver = tf.train.import_meta_graph('/home/tools/Tools/raoqiang/facenet/models/facenet/20170807-231648/model-20170807-231648-0.meta')
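Note that import_meta_graph only rebuilds the graph and returns a Saver; the variable values still have to be loaded with saver.restore, using the checkpoint prefix rather than any single file. Here is a rough sketch of the two steps together, assuming a TensorFlow r0.12 / 1.x style API. The helper names checkpoint_prefix and restore_checkpoint are made up for illustration, and the default prefix assumes the .meta file is named after the checkpoint (as with model.ckpt-1234.meta), which may not hold for every project layout.

```python
import re

# Suffixes that appear after the checkpoint prefix in the r0.12 file layout.
_SUFFIX = re.compile(r"\.(meta|index|data-\d+-of-\d+)$")


def checkpoint_prefix(path):
    """Strip a trailing .meta, .index, or .data-NNNNN-of-NNNNN suffix, if any."""
    return _SUFFIX.sub("", path)


def restore_checkpoint(meta_path, prefix=None):
    """Rebuild the graph from a .meta file, then load the saved variable values.

    Hypothetical helper. TensorFlow is imported lazily so that
    checkpoint_prefix can be used on its own.
    """
    import tensorflow as tf

    if prefix is None:
        # Assumption: the weight files share their prefix with the .meta file.
        prefix = checkpoint_prefix(meta_path)
    saver = tf.train.import_meta_graph(meta_path)  # graph structure only
    sess = tf.Session()
    saver.restore(sess, prefix)  # variable values
    return sess
```

If the .meta file is named differently from the weight files, pass the prefix explicitly as the second argument.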

Pass only the prefix, model.ckpt-1234, without the .data-00000-of-00001, .index, or .meta suffix.

At least, that works for me.
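To evaluate every checkpoint in turn, as the question asks, one option is to collect the prefixes from the directory yourself. Below is a minimal sketch in plain Python, assuming each checkpoint has a file named <prefix>-<global step>.index; list_checkpoint_prefixes is a hypothetical helper, not part of TensorFlow.

```python
import os
import re

# Matches e.g. "model.ckpt-1234.index" and captures the prefix and the step.
_INDEX = re.compile(r"^(?P<prefix>.+-(?P<step>\d+))\.index$")


def list_checkpoint_prefixes(logdir):
    """Return (global_step, prefix) pairs for each checkpoint in logdir, sorted by step."""
    found = []
    for name in os.listdir(logdir):
        match = _INDEX.match(name)
        if match:
            prefix = os.path.join(logdir, match.group("prefix"))
            found.append((int(match.group("step")), prefix))
    return sorted(found)
```

Each returned prefix can then be handed to saver.restore(sess, prefix) inside a loop, recording the precision against the step for the plot.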

