Retrained inception_v3 model deployed in Cloud ML Engine always outputs the same predictions

Question

I followed the TensorFlow For Poets codelab for transfer learning using inception_v3. It generates retrained_graph.pb and retrained_labels.txt files, which can be used to make predictions locally (by running label_image.py).

Then I wanted to deploy this model to Cloud ML Engine so that I could make online predictions. For that, I had to export retrained_graph.pb to the SavedModel format. I managed to do it by following this answer from Google's @rhaertel80 and this Python file from the Flowers Cloud ML Engine Tutorial. Here is my code:

import tensorflow as tf
from tensorflow.contrib import layers

from tensorflow.python.saved_model import builder as saved_model_builder
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import signature_def_utils
from tensorflow.python.saved_model import tag_constants
from tensorflow.python.saved_model import utils as saved_model_utils


export_dir = '../tf_files/saved7'
retrained_graph = '../tf_files/retrained_graph2.pb'
label_count = 5

def build_signature(inputs, outputs):
    signature_inputs = { key: saved_model_utils.build_tensor_info(tensor) for key, tensor in inputs.items() }
    signature_outputs = { key: saved_model_utils.build_tensor_info(tensor) for key, tensor in outputs.items() }

    signature_def = signature_def_utils.build_signature_def(
        signature_inputs,
        signature_outputs,
        signature_constants.PREDICT_METHOD_NAME
    )

    return signature_def

class GraphReferences(object):
  def __init__(self):
    self.examples = None
    self.train = None
    self.global_step = None
    self.metric_updates = []
    self.metric_values = []
    self.keys = None
    self.predictions = []
    self.input_jpeg = None

class Model(object):
    def __init__(self, label_count):
        self.label_count = label_count

    def build_image_str_tensor(self):
        image_str_tensor = tf.placeholder(tf.string, shape=[None])

        def decode_and_resize(image_str_tensor):
            return image_str_tensor

        image = tf.map_fn(
            decode_and_resize,
            image_str_tensor,
            back_prop=False,
            dtype=tf.string
        )

        return image_str_tensor

    def build_prediction_graph(self, g):
        tensors = GraphReferences()
        tensors.examples = tf.placeholder(tf.string, name='input', shape=(None,))
        tensors.input_jpeg = self.build_image_str_tensor()

        keys_placeholder = tf.placeholder(tf.string, shape=[None])
        inputs = {
            'key': keys_placeholder,
            'image_bytes': tensors.input_jpeg
        }

        keys = tf.identity(keys_placeholder)
        outputs = {
            'key': keys,
            'prediction': g.get_tensor_by_name('final_result:0')
        }

        return inputs, outputs

    def export(self, output_dir):
        with tf.Session(graph=tf.Graph()) as sess:
            with tf.gfile.GFile(retrained_graph, "rb") as f:
                graph_def = tf.GraphDef()
                graph_def.ParseFromString(f.read())
                tf.import_graph_def(graph_def, name="")

            g = tf.get_default_graph()
            inputs, outputs = self.build_prediction_graph(g)

            signature_def = build_signature(inputs=inputs, outputs=outputs)
            signature_def_map = {
                signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature_def
            }

            builder = saved_model_builder.SavedModelBuilder(output_dir)
            builder.add_meta_graph_and_variables(
                sess,
                tags=[tag_constants.SERVING],
                signature_def_map=signature_def_map
            )
            builder.save()

model = Model(label_count)
model.export(export_dir)

This code generates a saved_model.pb file, which I then used to create the Cloud ML Engine model. I can get predictions from this model using gcloud ml-engine predict --model my_model_name --json-instances request.json, where the contents of request.json are:

{ "key": "0", "image_bytes": { "b64": "jpeg_image_base64_encoded" } }
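For reference, a file of this shape can be produced with a few lines of Python. This is only a minimal sketch: the helper names `build_instance` and `write_request` and the file paths are illustrative and not part of the codelab or the tutorial. It writes exactly one instance, since `--json-instances` reads one JSON object per line.

```python
import base64
import json


def build_instance(jpeg_bytes, key="0"):
    """Return one prediction instance shaped like the request.json above."""
    # The {"b64": ...} wrapper marks the value as base64-encoded binary data.
    encoded = base64.b64encode(jpeg_bytes).decode("ascii")
    return {"key": key, "image_bytes": {"b64": encoded}}


def write_request(jpeg_path, request_path, key="0"):
    """Read a JPEG from disk and write a single-instance request file."""
    with open(jpeg_path, "rb") as f:
        instance = build_instance(f.read(), key)
    # One JSON object per line; this sketch writes a single line.
    with open(request_path, "w") as f:
        f.write(json.dumps(instance) + "\n")
```

Usage would look like `write_request("some_image.jpg", "request.json")`, where `some_image.jpg` is a hypothetical local file.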

However, no matter which jpeg I encode in the request, I always get the exact same wrong predictions:

[Image: prediction output]

I guess the problem is in the way the Cloud ML Prediction API passes the base64-encoded image bytes to the input tensor "DecodeJpeg/contents:0" of inception_v3 (the "build_image_str_tensor()" method in the code above). Any clue on how I can solve this issue and have my locally retrained model serve correct predictions on Cloud ML Engine?

(Just to be clear, the problem is not in retrained_graph.pb, as it makes correct predictions when I run it locally; nor is it in request.json, because the same request file worked without problems when following the Flowers Cloud ML Engine Tutorial mentioned above.)

rhaertel80 · Accepted Answer

First, a general warning. The TensorFlow for Poets codelab was not written in a way that is very amenable to production serving (partly manifested by the workarounds you are having to implement). You would normally export a prediction-specific graph that doesn't contain all of the extra training ops. So while we can try and hack something together that works, extra work may be needed to productionize this graph.

The approach of your code appears to be to import one graph, add some placeholders, and then export the result. This is generally fine. However, in the code shown in the question, you are adding input placeholders without actually connecting them to anything in the imported graph. You end up with a graph containing multiple disconnected subgraphs, something like (excuse the crude diagram):

image_str_tensor [input=image_bytes] -> <nothing>
keys_placeholder [input=key]  -> identity [output=key]
inception_subgraph -> final_graph [output=prediction]

By inception_subgraph I mean all of the ops that you are importing.

So image_bytes is effectively a no-op and is ignored, key gets passed through, and prediction contains the result of running the inception_subgraph. Since the subgraph is not using the input you are passing, it returns the same result every time (though I admit I actually expected an error here).

To address this problem, we would need to connect the placeholder you've created to the one that already exists in inception_subgraph to create a graph more or less like this:

image_str_tensor [input=image_bytes] -> inception_subgraph -> final_graph [output=prediction]
keys_placeholder [input=key]  -> identity [output=key]

Note that image_str_tensor is going to be a batch of images, as required by the prediction service, but the inception graph's input is actually a single image. In the interest of simplicity, we're going to address this in a hacky way: we'll assume we'll be sending images one-by-one. If we ever send more than one image per request, we'll get errors. Also, batch prediction will never work.

The main change you need is in the import statement, which connects the placeholder we've added to the existing input in the graph. The code below also changes the shape of the input so that it matches the single image the graph expects.

Putting it all together, we get something like:

import tensorflow as tf


export_dir = '../tf_files/saved7'
retrained_graph = '../tf_files/retrained_graph2.pb'
label_count = 5

class Model(object):
    def __init__(self, label_count):
        self.label_count = label_count

    # The inputs and outputs are built directly in export(), so the separate
    # build_prediction_graph() helper from the question is not needed here.
    def export(self, output_dir):
        with tf.Session(graph=tf.Graph()) as sess:
            # This will be our input that accepts a batch of inputs
            image_bytes = tf.placeholder(tf.string, name='input', shape=(None,))
            # Force it to be a single input; will raise an error if we send a batch.
            coerced = tf.squeeze(image_bytes)
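            # When we import the graph, we'll connect `coerced` to `DecodeJPGInput:0`.
            # (The question refers to this input as `DecodeJpeg/contents:0`; use
            # whichever name your retrained graph actually contains.)
            input_map = {'DecodeJPGInput:0': coerced}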

            with tf.gfile.GFile(retrained_graph, "rb") as f:
                graph_def = tf.GraphDef()
                graph_def.ParseFromString(f.read())
                tf.import_graph_def(graph_def, input_map=input_map, name="")

            keys_placeholder = tf.placeholder(tf.string, shape=[None])

            inputs = {'image_bytes': image_bytes, 'key': keys_placeholder}

            keys = tf.identity(keys_placeholder)
            outputs = {
                'key': keys,
                'prediction': tf.get_default_graph().get_tensor_by_name('final_result:0')
            }

            # simple_save lives under tf.saved_model in TensorFlow 1.x
            tf.saved_model.simple_save(sess, output_dir, inputs, outputs)

model = Model(label_count)
model.export(export_dir)

Tags: machine-learning, tensorflow, google-cloud-platform, computer-vision, google-cloud-ml

hecforga
