I'm trying to restore a TensorFlow model. I followed this example: http://nasdag.github.io/blog/2016/01/19/classifying-bees-with-google-tensorflow/
At the end of the code in the example I added these lines:
saver = tf.train.Saver()
save_path = saver.save(sess, "model.ckpt")
print("Model saved in file: %s" % save_path)
Two files were created: checkpoint and model.ckpt.
In a new Python file (tomas_bees_predict.py), I have this code:
import tensorflow as tf

saver = tf.train.Saver()

with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "model.ckpt")
    print("Model restored.")
However, when I execute the code, I get this error:
Traceback (most recent call last):
  File "tomas_bees_predict.py", line 3, in <module>
    saver = tf.train.Saver()
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 705, in __init__
    raise ValueError("No variables to save")
ValueError: No variables to save
Is there a way to read the model.ckpt file and see which variables are saved? Or maybe someone can help with saving the model and restoring it based on the example described above?
EDIT 1:
I believe I tried running the same code to recreate the model structure, and I still got the error. I think it could be related to the fact that the code described here doesn't use named variables: http://nasdag.github.io/blog/2016/01/19/classifying-bees-with-google-tensorflow/
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)
So I ran an experiment: I wrote two versions of the code that saves the model (with and without named variables) and one script that restores it.
tensor_save_named_vars.py:
import tensorflow as tf

# Create some variables.
v1 = tf.Variable(1, name="v1")
v2 = tf.Variable(2, name="v2")

# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    print "v1 = ", v1.eval()
    print "v2 = ", v2.eval()
    # Save the variables to disk.
    save_path = saver.save(sess, "/tmp/model.ckpt")
    print "Model saved in file: ", save_path
tensor_save_not_named_vars.py:
import tensorflow as tf

# Create some variables.
v1 = tf.Variable(1)
v2 = tf.Variable(2)

# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
    sess.run(init_op)
    print "v1 = ", v1.eval()
    print "v2 = ", v2.eval()
    # Save the variables to disk.
    save_path = saver.save(sess, "/tmp/model.ckpt")
    print "Model saved in file: ", save_path
tensor_restore.py:
import tensorflow as tf

# Create some variables.
v1 = tf.Variable(0, name="v1")
v2 = tf.Variable(0, name="v2")

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    saver.restore(sess, "/tmp/model.ckpt")
    print "Model restored."
    print "v1 = ", v1.eval()
    print "v2 = ", v2.eval()
Here is what I get when I execute this code:
$ python tensor_save_named_vars.py
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
v1 = 1
v2 = 2
Model saved in file: /tmp/model.ckpt
$ python tensor_restore.py
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
Model restored.
v1 = 1
v2 = 2
$ python tensor_save_not_named_vars.py
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
v1 = 1
v2 = 2
Model saved in file: /tmp/model.ckpt
$ python tensor_restore.py
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 4
I tensorflow/core/common_runtime/direct_session.cc:58] Direct session inter op parallelism threads: 4
W tensorflow/core/common_runtime/executor.cc:1076] 0x7ff953881e40 Compute status: Not found: Tensor name "v2" not found in checkpoint files /tmp/model.ckpt
[[Node: save/restore_slice_1 = RestoreSlice[dt=DT_INT32, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/restore_slice_1/tensor_name, save/restore_slice_1/shape_and_slice)]]
W tensorflow/core/common_runtime/executor.cc:1076] 0x7ff953881e40 Compute status: Not found: Tensor name "v1" not found in checkpoint files /tmp/model.ckpt
[[Node: save/restore_slice = RestoreSlice[dt=DT_INT32, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/restore_slice/tensor_name, save/restore_slice/shape_and_slice)]]
Traceback (most recent call last):
  File "tensor_restore.py", line 14, in <module>
    saver.restore(sess, "/tmp/model.ckpt")
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 891, in restore
    sess.run([self._restore_op_name], {self._filename_tensor_name: save_path})
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 368, in run
    results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 444, in _do_run
    e.code)
tensorflow.python.framework.errors.NotFoundError: Tensor name "v2" not found in checkpoint files /tmp/model.ckpt
     [[Node: save/restore_slice_1 = RestoreSlice[dt=DT_INT32, preferred_shard=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/restore_slice_1/tensor_name, save/restore_slice_1/shape_and_slice)]]
Caused by op u'save/restore_slice_1', defined at:
  File "tensor_restore.py", line 8, in <module>
    saver = tf.train.Saver()
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 713, in __init__
    restore_sequentially=restore_sequentially)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 432, in build
    filename_tensor, vars_to_save, restore_sequentially, reshape)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 191, in _AddRestoreOps
    values = self.restore_op(filename_tensor, vs, preferred_shard)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 106, in restore_op
    preferred_shard=preferred_shard)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/io_ops.py", line 189, in _restore_slice
    preferred_shard, name=name)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 271, in _restore_slice
    preferred_shard=preferred_shard, name=name)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 664, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1834, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1043, in __init__
    self._traceback = _extract_stack()
So perhaps the original code (see the external link above) could be modified to something like this:
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    weight_var = tf.Variable(initial, name="weight_var")
    return weight_var

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    bias_var = tf.Variable(initial, name="bias_var")
    return bias_var
But then my question is: is restoring the weight_var and bias_var variables sufficient to run predictions? I trained on a powerful machine with a GPU, and I would like to copy the model to a less powerful computer without a GPU to run predictions.
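There's a similar question here: "Tensorflow: how to save/restore a model?" TL;DR: you need to recreate the model structure using the same sequence of TensorFlow API calls before using the Saver object to restore the weights. A Saver created in a graph that contains no variables has nothing to map the checkpoint onto, which is why tf.train.Saver() raised "No variables to save".
This is suboptimal; follow GitHub issue #696 for progress on making this easier.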
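As a rough sketch of that rebuild-then-restore pattern, here is a small round trip. It assumes a TensorFlow build that exposes the tf.compat.v1 API (the scripts in the question use the older top-level names), and build_model is a hypothetical stand-in for whatever code defines the real network. It also uses tf.train.list_variables to show which variable names ended up in the checkpoint.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1  # assumption: a TensorFlow build that ships tf.compat.v1


def build_model():
    """Hypothetical stand-in for the real model: same calls, same order."""
    v1 = tf1.Variable(1, name="v1")
    v2 = tf1.Variable(2, name="v2")
    return v1, v2


# Temporary location for the checkpoint (illustrative path only).
ckpt_path = os.path.join(tempfile.mkdtemp(), "model.ckpt")

# "Training" side: build the graph, initialize the variables, save them.
with tf.Graph().as_default():
    build_model()
    saver = tf1.train.Saver()
    with tf1.Session() as sess:
        sess.run(tf1.global_variables_initializer())
        saver.save(sess, ckpt_path)

# Inspect the checkpoint: list_variables yields (name, shape) pairs.
saved_names = sorted(name for name, _shape in tf.train.list_variables(ckpt_path))
print(saved_names)

# "Prediction" side: rebuild the same structure in a fresh graph first,
# then create the Saver and restore. No initializer is run here.
with tf.Graph().as_default():
    v1, v2 = build_model()
    saver = tf1.train.Saver()
    with tf1.Session() as sess:
        saver.restore(sess, ckpt_path)
        restored = sess.run([v1, v2])
print(restored)
```

Keeping the model definition in one function that both the training script and the prediction script call is one way to make sure the variables are created with the same names in the same order on both sides.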
If a problem like this occurs, try restarting your kernel. When the code is run again in the same kernel, the newly created variables conflict with the ones created earlier, which produces NotFoundError and other issues.
I encountered the same type of problem, and restarting the kernel worked for me. (Caution: avoid running your code multiple times in the same kernel, as recreating the variables can overwrite the existing ones in your model file and end up changing the original values.)
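One way to picture the conflict: within a single graph, TensorFlow keeps variable names unique by appending a numeric suffix, so creating a variable named "v1" a second time yields "v1_1", and unnamed variables become "Variable", "Variable_1", and so on. The toy model below is plain Python, not TensorFlow code; ToyGraph and missing_after_runs are hypothetical names, and the naming rule is simplified. It only illustrates why a second run in the same kernel ends up looking for names the checkpoint never contained.

```python
class ToyGraph:
    """Toy stand-in for a graph's name registry (simplified, not TensorFlow)."""

    def __init__(self):
        self._used = {}

    def unique_name(self, name="Variable"):
        # First use keeps the name; later uses get a _1, _2, ... suffix.
        count = self._used.get(name, 0)
        self._used[name] = count + 1
        return name if count == 0 else "%s_%d" % (name, count)


# Names written to the checkpoint by the first, clean run.
checkpoint = {"v1": 1, "v2": 2}


def missing_after_runs(runs):
    """Names the latest run would look up but the checkpoint lacks."""
    graph = ToyGraph()  # the same graph survives across re-runs in one kernel
    names = []
    for _ in range(runs):
        names = [graph.unique_name("v1"), graph.unique_name("v2")]
    return [n for n in names if n not in checkpoint]


print(missing_after_runs(1))  # fresh kernel: every name is found
print(missing_after_runs(2))  # re-run in the same kernel: suffixed names are missing
```

Restarting the kernel corresponds to starting again with an empty ToyGraph, which is why the names line up with the checkpoint once more.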