I think it would be immensely helpful to the TensorFlow community if there were a well-documented solution to the crucial task of testing a single new image against the model created by the convnet in the CIFAR-10 tutorial.
I may be wrong, but this critical step, which makes the trained model usable in practice, seems to be missing. There is a "missing link" in that tutorial: a script that would directly load a single image (as an array or binary), compare it against the trained model, and return a classification.
Prior answers give partial solutions that explain the overall approach, but I have not been able to implement any of them successfully. Other bits and pieces can be found here and there, but they haven't added up to a working solution. Kindly consider the research I've done before tagging this as a duplicate or already answered:
Tensorflow: how to save/restore a model?
Restoring TensorFlow model
Unable to restore models in tensorflow v0.8
https://gist.github.com/nikitakit/6ef3b72be67b86cb7868
The most popular answer is the first, in which @RyanSepassi and @YaroslavBulatov describe the problem and an approach: one needs to "manually construct a graph with identical node names, and use Saver to load the weights into it". Although both answers are helpful, it is not apparent how one would go about plugging this into the CIFAR-10 project.
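As I understand it, the pattern they describe would look roughly like this; build_graph() here is hypothetical shorthand for whatever code originally defined the model:

import tensorflow as tf

with tf.Graph().as_default():
    # Rebuild the graph with exactly the code used at training time,
    # so every variable gets the same name it had in the checkpoint.
    logits = build_graph()  # hypothetical stand-in for the model definition

    saver = tf.train.Saver()
    with tf.Session() as sess:
        # Saver matches variables by name and loads the trained weights.
        saver.restore(sess, '/path/to/model.ckpt')
        # ...then run inference, e.g. sess.run(logits, feed_dict=...)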
A fully functional solution would be highly desirable, so that we could port it to other single-image classification problems. There are several questions on SO asking for exactly this, but still no full answer (for example Load checkpoint and evaluate single image with tensorflow DNN).
I hope we can converge on a working script that everyone could use.
The script below is not yet functional, and I'd be happy to hear your suggestions on how it can be improved into a working solution for single-image classification using the model trained by the CIFAR-10 TF tutorial.
Assume all variables, file names etc. are untouched from the original tutorial.
New file: cifar10_eval_single.py
import cv2
import tensorflow as tf

FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_string('eval_dir', './input/eval',
                           """Directory where to write event logs.""")
tf.app.flags.DEFINE_string('checkpoint_dir', './input/train',
                           """Directory where to read model checkpoints.""")


def get_single_img():
    file_path = './input/data/single/test_image.tif'
    pixels = cv2.imread(file_path, 0)
    return pixels


def eval_single_img():
    # Below code adapted from @RyanSepassi, however not functional.
    # Among other errors, saver throws an error that there are no
    # variables to save.
    with tf.Graph().as_default():
        # Get image.
        image = get_single_img()

        # Build a Graph.
        # TODO

        # Create dummy variables.
        x = tf.placeholder(tf.float32)
        w = tf.Variable(tf.zeros([1, 1], dtype=tf.float32))
        b = tf.Variable(tf.ones([1, 1], dtype=tf.float32))
        y_hat = tf.add(b, tf.matmul(x, w))

        saver = tf.train.Saver()

        with tf.Session() as sess:
            sess.run(tf.initialize_all_variables())
            ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)

            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess, ckpt.model_checkpoint_path)
                print('Checkpoint found')
            else:
                print('No checkpoint found')

            # Run the model to get predictions.
            predictions = sess.run(y_hat, feed_dict={x: image})
            print(predictions)


def main(argv=None):
    if tf.gfile.Exists(FLAGS.eval_dir):
        tf.gfile.DeleteRecursively(FLAGS.eval_dir)
    tf.gfile.MakeDirs(FLAGS.eval_dir)
    eval_single_img()


if __name__ == '__main__':
    tf.app.run()
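For concreteness, here is how I imagine the "Build a Graph" TODO might be filled in, so that Saver has the real CIFAR-10 variables to restore. This is an untested sketch: it assumes the checkpoint comes from the unmodified tutorial, that cifar10.py is importable, and that the test image is RGB; the 24x24 input size and the moving-average restore mirror what cifar10_eval.py does.

import numpy as np
import cifar10

def eval_single_img_sketch():
    with tf.Graph().as_default():
        # Read and resize the image to the tutorial's 24x24 input size.
        # (The training pipeline also standardizes each image; for faithful
        # results the same preprocessing should be applied here.)
        pixels = cv2.imread('./input/data/single/test_image.tif')  # HxWx3
        pixels = cv2.resize(pixels, (24, 24)).astype(np.float32)
        image = tf.convert_to_tensor(pixels.reshape(1, 24, 24, 3))

        # Building the graph through cifar10.inference reuses the tutorial's
        # variable names, which is what lets Saver find the trained weights.
        logits = cifar10.inference(image)

        # cifar10_eval.py restores the moving-average shadow variables.
        variable_averages = tf.train.ExponentialMovingAverage(
            cifar10.MOVING_AVERAGE_DECAY)
        saver = tf.train.Saver(variable_averages.variables_to_restore())

        with tf.Session() as sess:
            ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess, ckpt.model_checkpoint_path)
                print(sess.run(tf.argmax(logits, 1)))  # predicted class
            else:
                print('No checkpoint found')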
There are two methods to feed a single new image to the cifar10 model. The first method is the cleaner approach, but it requires modifying the main file and hence retraining. The second method is applicable when a user does not want to modify the model files and instead wants to use the existing checkpoint/meta-graph files.
The code for the first approach is as follows:
import tensorflow as tf
import numpy as np
import cv2

sess = tf.Session('', tf.Graph())
with sess.graph.as_default():
    # Read meta graph and checkpoint to restore the tf session.
    saver = tf.train.import_meta_graph("/tmp/cifar10_train/model.ckpt-200.meta")
    saver.restore(sess, "/tmp/cifar10_train/model.ckpt-200")

    # Read a single image from a file.
    img = cv2.imread('tmp.png')
    img = np.expand_dims(img, axis=0)

    # Start the queue runners. If they are not started the program will hang;
    # see e.g. https://www.tensorflow.org/programmers_guide/reading_data
    coord = tf.train.Coordinator()
    threads = []
    for qr in sess.graph.get_collection(tf.GraphKeys.QUEUE_RUNNERS):
        threads.extend(qr.create_threads(sess, coord=coord, daemon=True,
                                         start=True))

    # In the graph created above, feed the "is_training" and "imgs"
    # placeholders. Feeding them disconnects the path from the queue runners
    # to the graph and enables a path from the placeholders instead. The
    # "imgs" placeholder is fed with the image that was read above.
    logits = sess.run('softmax_linear/softmax_linear:0',
                      feed_dict={'is_training:0': False, 'imgs:0': img})

    # Print classification results.
    print(logits)
For the script to work, the user must create two placeholders and a conditional execution statement.
The placeholders and conditional execution statement are added in cifar10_train.py as shown below:
def train():
    """Train CIFAR-10 for a number of steps."""
    with tf.Graph().as_default():
        global_step = tf.contrib.framework.get_or_create_global_step()

        with tf.device('/cpu:0'):
            images, labels = cifar10.distorted_inputs()

        is_training = tf.placeholder(dtype=bool, shape=(), name='is_training')
        imgs = tf.placeholder(tf.float32, (1, 32, 32, 3), name='imgs')
        images = tf.cond(is_training, lambda: images, lambda: imgs)
        logits = cifar10.inference(images)
The inputs of the cifar10 model are connected to a queue runner object, a multi-stage queue that can prefetch data from files in parallel. See a nice animation of queue runners here.
While queue runners are efficient at prefetching a large dataset for training, they are overkill for inference/testing, where only a single file needs to be classified, and they are also a bit more involved to modify/maintain. For that reason, I have added an "is_training" placeholder, which is set to True during training and False at inference, as shown below:
import numpy as np

tmp_img = np.ndarray(shape=(1, 32, 32, 3), dtype=float)

with tf.train.MonitoredTrainingSession(
        checkpoint_dir=FLAGS.train_dir,
        hooks=[tf.train.StopAtStepHook(last_step=FLAGS.max_steps),
               tf.train.NanTensorHook(loss),
               _LoggerHook()],
        config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement)) as mon_sess:
    while not mon_sess.should_stop():
        mon_sess.run(train_op, feed_dict={is_training: True, imgs: tmp_img})
Another placeholder, "imgs", holds a tensor of shape (1, 32, 32, 3) for the image to be fed during inference; the first dimension is the batch size, which is one in this case. I have modified the cifar model to accept 32x32 images instead of 24x24, since the original CIFAR-10 images are 32x32.
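If you want to reproduce that 32x32 modification, the crop size is a single constant in cifar10_input.py (at least in the versions of the tutorial I have seen; double-check your copy):

# cifar10_input.py
# The tutorial crops/resizes images to this size before feeding the network.
# Setting it to 32 makes the network consume full 32x32 CIFAR images, which
# is what the (1, 32, 32, 3) "imgs" placeholder above assumes.
IMAGE_SIZE = 32  # originally 24 in the tutorial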
Finally, the conditional statement feeds either the placeholder or the queue runner output to the graph. The "is_training" placeholder is set to False during inference, and the "imgs" placeholder is fed a numpy array; the array is reshaped from a 3-dimensional to a 4-dimensional tensor to conform to the input expected by the model's inference function.
That is all there is to it. Any model can be run on single, user-defined test data as shown in the script above: essentially, read the graph, feed data to the graph's nodes, and run the graph to get the final output.
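To state that recipe as a bare-bones pattern outside of CIFAR-10 (the checkpoint path and the tensor names 'input:0'/'output:0' below are placeholders for whatever your graph actually uses):

import numpy as np
import tensorflow as tf

sess = tf.Session('', tf.Graph())
with sess.graph.as_default():
    # 1. Read the graph and its trained weights back from disk.
    saver = tf.train.import_meta_graph('/tmp/model.ckpt.meta')
    saver.restore(sess, '/tmp/model.ckpt')

    # 2. Prepare data for the graph's input node (shape is an assumption).
    data = np.zeros((1, 32, 32, 3), dtype=np.float32)

    # 3. Run the graph to get the final output.
    out = sess.run('output:0', feed_dict={'input:0': data})
    print(out)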
Now the second method. The other approach is to hack cifar10.py and cifar10_eval.py to change the batch size to one and to replace the data coming from the queue runner with data read from a file.
Set batch size to 1:
tf.app.flags.DEFINE_integer('batch_size', 1,
                            """Number of images to process in a batch.""")
Call inference with an image read from a file:
def evaluate():
    with tf.Graph().as_default() as g:
        # Get images and labels for CIFAR-10.
        eval_data = FLAGS.eval_data == 'test'
        images, labels = cifar10.inputs(eval_data=eval_data)

        import cv2
        img = cv2.imread('tmp.png')
        img = np.expand_dims(img, axis=0)
        img = tf.cast(img, tf.float32)

        logits = cifar10.inference(img)
Then pass logits to eval_once and modify eval_once to evaluate logits:
def eval_once(saver, summary_writer, top_k_op, logits, summary_op):
    ...
    while step < num_iter and not coord.should_stop():
        predictions = sess.run([top_k_op])
        print(sess.run(logits))
There is no separate script to run this method of inference; just run cifar10_eval.py, which will now read a file from the user-defined location with a batch size of one.
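For example, assuming the default flag values from the tutorial (the stock cifar10_eval.py also defines a --run_once flag):

python cifar10_eval.py --run_once True --checkpoint_dir /tmp/cifar10_train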
Here's how I ran a single image at a time. I'll admit it seems a bit hacky with the reuse of the variable scope.
This is a helper function:
import os
import tensorflow as tf

def restore_vars(saver, sess, chkpt_dir):
    """Restore saved net, global score and step, and epsilons, OR create
    the checkpoint directory for later storage."""
    sess.run(tf.initialize_all_variables())

    checkpoint_dir = chkpt_dir
    if not os.path.exists(checkpoint_dir):
        try:
            os.makedirs(checkpoint_dir)
        except OSError:
            pass

    path = tf.train.get_checkpoint_state(checkpoint_dir)
    print(checkpoint_dir, "path = ", path)
    if path is None:
        return False
    else:
        saver.restore(sess, path.model_checkpoint_path)
        return True
Here is the main part of the code, which runs a single image at a time within the for loop.
to_restore = True
with tf.Session() as sess:
    for i in test_img_idx_set:
        # Get the image.
        images = get_image(i)
        images = np.asarray(images, dtype=np.float32)
        images = tf.convert_to_tensor(images / 255.0)
        # Resize the image to whatever your model takes in.
        images = tf.image.resize_images(images, 256, 256)
        images = tf.reshape(images, (1, 256, 256, 3))
        images = tf.cast(images, tf.float32)

        saver = tf.train.Saver(max_to_keep=5, keep_checkpoint_every_n_hours=1)

        with tf.variable_scope(tf.get_variable_scope()) as scope:
            if to_restore:
                logits = inference(images)
            else:
                scope.reuse_variables()
                logits = inference(images)

        if to_restore:
            restored = restore_vars(saver, sess, FLAGS.train_dir)
            print("restored ", restored)
            to_restore = False

        logit_val = sess.run(logits)
        print(logit_val)
Here is an alternative implementation of the above using placeholders; it's a bit cleaner in my opinion, but I'll leave the example above for historical reasons:
imgs_place = tf.placeholder(tf.float32, shape=[my_img_shape_put_here])
images = tf.reshape(imgs_place, (1, 256, 256, 3))

saver = tf.train.Saver(max_to_keep=5, keep_checkpoint_every_n_hours=1)

logits = inference(images)

with tf.Session() as sess:
    # Restore once, outside the loop.
    restored = restore_vars(saver, sess, FLAGS.train_dir)
    print("restored ", restored)
    for i in test_img_idx_set:
        logit_val = sess.run(logits, feed_dict={imgs_place: i})
        print(logit_val)