
How to use `transform_graph` in TensorFlow

I want to optimize my frozen, trained TensorFlow model. However, I found out that the optimize_for_inference library is no longer available. This is what I was trying to use:

import tensorflow as tf

from tensorflow.python.tools import freeze_graph
from tensorflow.python.tools import optimize_for_inference_lib

# Load the frozen graph from disk
input_graph_def = tf.GraphDef()
with tf.gfile.Open("./inference_graph/frozen_model.pb", "rb") as f:
    data = f.read()
    input_graph_def.ParseFromString(data)

# Keep only the nodes needed to get from the input to the outputs
output_graph_def = optimize_for_inference_lib.optimize_for_inference(
        input_graph_def,
        ["image_tensor"],  # input node
        ["detection_boxes", "detection_scores",
         "detection_classes", "num_detections"],  # output nodes
        tf.float32.as_datatype_enum)

# Write the optimized graph back to disk
with tf.gfile.FastGFile("./optimized_model.pb", "wb") as f:
    f.write(output_graph_def.SerializeToString())

I found the transform_graph tool from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/graph_transforms/README.md#strip_unused_nodes to optimize my frozen model, and I was able to successfully generate a working optimized model for my object detection model. The purpose of generating an optimized version of the model is to improve its inference speed. I ran this command in bash (from the TensorFlow root directory):

bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
--in_graph=/Users/cvsanbuenaventura/Documents/tensorflow_fastlog/models/research/object_detection/inference_graph/frozen_inference_graph.pb \
--out_graph=/Users/cvsanbuenaventura/Documents/tensorflow_fastlog/models/research/object_detection/inference_graph/optimized_inference_graph-transform_graph-manyoutputs-planA2-v2.pb \
--inputs='image_tensor' \
--outputs='detection_boxes, detection_scores, detection_classes, num_detections' \
--transforms='fold_batch_norms
fold_old_batch_norms
fold_constants(ignore_errors=true)'
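
To sanity-check the result, here is a minimal sketch (TensorFlow 1.x APIs; the shortened path is just a stand-in for the long --out_graph path above) that loads the transformed graph and confirms the input/output tensor names still resolve:

import numpy as np
import tensorflow as tf

# Load the transformed graph (shortened stand-in path)
graph_def = tf.GraphDef()
with tf.gfile.GFile("./inference_graph/optimized_inference_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")

# Resolve the tensors named in --inputs / --outputs
image_tensor = graph.get_tensor_by_name("image_tensor:0")
output_tensors = [graph.get_tensor_by_name(name + ":0") for name in
                  ["detection_boxes", "detection_scores",
                   "detection_classes", "num_detections"]]

# One dummy run to confirm the graph is still wired end to end
with tf.Session(graph=graph) as sess:
    dummy = np.zeros((1, 300, 300, 3), dtype=np.uint8)  # arbitrary test size
    results = sess.run(output_tensors, feed_dict={image_tensor: dummy})
    print([r.shape for r in results])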

So my questions are:

  1. What do the fold_batch_norms, fold_old_batch_norms, and fold_constants(ignore_errors=true) transforms do?
  2. I was able to successfully generate an optimized model using the three transforms above, but there are other transforms as well, e.g. strip_unused_nodes(type=float, shape="1,299,299,3"). What does this one do, and what shape should I put there?
  3. Does the optimize_for_inference library not exist anymore?
Asked Sep 17 '18 by Chaine



1 Answer

I'm looking for much the same answers as you are; here is what I have so far.

  1. As for explanations, I found this presentation, which goes into quite a bit of detail; slides 14 and 15 seem to have what you want to know about SimplifyGraph(): https://web.stanford.edu/class/cs245/slides/TFGraphOptimizationsStanford.pdf

  2. The "1,299,299,3" shape seems to correspond to an SSD-300x300 model, so I guess it forces the input data to be resized to that shape. I've read that the idea of the optimization is to remove nodes that are required for training but not for inference. In my case, I'm using a 1920x1080 Faster R-CNN model, so I guess I'll have to use "1,1080,1920,3".

  3. Most likely not; you would have to check the TensorFlow team's changelogs to be sure. (A quick import probe is sketched right after this list.)
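
A minimal sketch to check on your own installation (nothing version-specific assumed, just an import probe):

# Probe whether the optimize_for_inference tooling is importable in the
# installed TensorFlow build; if it isn't, fall back to the graph_transforms
# tool instead.
try:
    from tensorflow.python.tools import optimize_for_inference_lib
    print("optimize_for_inference_lib found at", optimize_for_inference_lib.__file__)
except ImportError as err:
    print("optimize_for_inference_lib not available:", err)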

EDIT:

  1. I finally ran my tests. It seems that with Faster R-CNN (and possibly R-FCN) I don't get any inference benefit on GPU from an 'optimized for inference' model (my reference card is a GTX Titan X Maxwell, but I also have an AGX Xavier to test on). I tried a 'quantized' model with this command:

    ~/build/tensorflow/tf_1.12.3-cpu/bazel-bin/tensorflow/tools/graph_transforms/transform_graph \
    --in_graph='model.cas.f01-v2_aug_frcnn-1920-1080-dia.pb' \
    --out_graph='opt-for-inf/opt_2q_model.cas.f01-v2_aug_frcnn-1920-1080-dia.pb' \
    --inputs="image_tensor" \
    --outputs="detection_boxes,detection_scores,detection_classes,num_detections" \
    --transforms='add_default_attributes strip_unused_nodes(type=float, shape="1,1080,1920,3") remove_nodes(op=Identity, op=CheckNumerics) fold_constants(ignore_errors=true) fold_batch_norms fold_old_batch_norms merge_duplicate_nodes quantize_weights sort_by_execution_order'

And it did not make the inference times any better (let's say, going on the Xavier from 1.2 seconds per inference to 0.8 or so). Adding 'quantize_nodes' gave me a mismatch in the model's layers, which made it unusable. Maybe it works differently for this topology, and I would need to explore more to see how to optimize this model for inference. It does seem to work for SSDs, though; I'll test my own and publish the results.

  2. What I do know is that if for training you have access to at least a Volta-architecture GPU (Titan V or Tesla V100) or an RTX card, you can set an environment variable and train the model with mixed data types (FP16 where possible, with some parts kept in FP32); a minimal sketch is at the end of this answer. That gives you a better model for inference if you don't really need the extra precision, which depends on the use case: for medical images you want the highest precision possible, while for object detection of vehicles and the like I guess you can trade precision for speed. Mixed-precision training with NVIDIA CUDA: https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html#tensorflow-amp

  3. My other approach would be to convert the model to TF-Lite and see how to run inference there (see the sketch right below). It's still on my backlog.
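
For reference, a minimal conversion sketch using the TF 1.x converter API; the paths, the input shape, and the assumption that the frozen detection graph converts directly are all placeholders (detection models often need a TF-Lite-friendly export step first):

import tensorflow as tf

# Placeholder path to a frozen inference graph; a detection model may need a
# TF-Lite-friendly export first, so treat this only as a starting point.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="frozen_inference_graph.pb",
    input_arrays=["image_tensor"],
    output_arrays=["detection_boxes", "detection_scores",
                   "detection_classes", "num_detections"],
    input_shapes={"image_tensor": [1, 1080, 1920, 3]})

tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)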

I compiled TensorFlow with Bazel v0.19.x.
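
And here is the mixed-precision sketch mentioned in point 2; it assumes TF 1.14+ built with CUDA on a Volta/Turing-class GPU, and the environment variable name is the one documented in the NVIDIA guide linked above:

import os

# Enable NVIDIA's automatic mixed-precision graph rewrite (assumption: TF 1.14+
# with CUDA on a Volta/Turing-class GPU; variable name as documented in the
# NVIDIA guide linked above). Set it before the training session is created.
os.environ["TF_ENABLE_AUTO_MIXED_PRECISION"] = "1"

import tensorflow as tf

# ...then build and run the usual training graph; eligible ops are rewritten
# to FP16 while numerically sensitive ones stay in FP32.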

Answered Oct 17 '22 by zRISC