I'm working with the new tf.data.Dataset API and I can't figure out how to perform inference. Ultimately, I want to convert my model to a TensorRT graph and run it on the TX2, and all of the examples I have found assume you have a tf.placeholder for the input. Here is pseudocode for how I am training. The [...] is just meant to be a placeholder since I didn't actually run the code. Let's not debate the model, as it is just supposed to give an example:
import tensorflow as tf
# Setup iterator
datain = tf.data.FixedLengthRecordDataset(datafiles, record_bytes1)
labels = tf.data.FixedLengthRecordDataset(labelfiles, record_bytes2)
dataset = tf.data.Dataset.zip((datain, labels))
dataset = dataset.prefetch(batch_size)
dataset = dataset.repeat(n_epoch)
iterator = dataset.make_initializable_iterator()
sess = tf.Session()
sess.run(iterator.initializer)
[batch_x, batch_y] = iterator.get_next()
# Define model function (let's not debate model except as relevant to question)
def model_fn(xin):
    x0 = tf.transpose(tf.reshape(xin, [...], name='input'))
    w = tf.Variable(tf.truncated_normal([...], stddev=0.1))
    x1 = tf.nn.conv2d(x0, w, strides=[...], padding='VALID')
    b = tf.Variable(tf.constant(0.0, shape=[...]))
    x2 = tf.nn.bias_add(x1, b)
    x3 = tf.nn.relu(x2, name='output')
    return x3
# Setup training environment
model = model_fn(batch_x)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=model, labels=batch_y))
optimizer = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)
# Train Model
while True:
    try:
        sess.run(optimizer)
    except tf.errors.OutOfRangeError:
        # Raised once the dataset has been consumed for all n_epoch repeats
        break
# Save model
saver = tf.train.Saver(name='saver')
saver.save(sess, 'temp/path')
My question is: how do I get this into TensorRT without having the input be a tf.placeholder? All of the examples I can find use a tf.placeholder as the input. This example suggests that I can replace the iterator with a placeholder using the SavedModel class, but I cannot find any documentation on how to accomplish that.
Thanks!
EDIT: Here is my solution, thanks to the help below:
from tensorflow.python.tools import optimize_for_inference_lib
import uff
# You can feed data to the IteratorGetNext node using feed_dict
input_node_name = 'iterator_scope_name/IteratorGetNext'
output_node_name = 'model_scope_name/output'
# Run inference on the trained model:
graph = tf.get_default_graph()
batch_x = graph.get_tensor_by_name(input_node_name + ':0')
networkout = graph.get_tensor_by_name(output_node_name + ':0')
testdata, testlabel = custom_data_reader_fn(data_folder)
# This will evaluate the model
label = sess.run(networkout, feed_dict={batch_x: testdata})
# Freeze model and create a UFF file:
graph_def = graph.as_graph_def()  # Get the graph as a GraphDef protocol buffer
frozen_graph_def = tf.graph_util.convert_variables_to_constants(
    sess, graph_def, [output_node_name])
opt_graph_def = optimize_for_inference_lib.optimize_for_inference(
    frozen_graph_def, [input_node_name], [output_node_name],
    tf.float32.as_datatype_enum)
uff.from_tensorflow(opt_graph_def, [output_node_name], quiet=False,
                    output_filename='opt_model.uff')
This writes out a UFF file that TensorRT can use. The biggest issues I encountered were that the optimize_for_inference_lib.optimize_for_inference operation replaced the iterator with a tf.placeholder, and working out that the IteratorGetNext node is the one to feed data to.

Since you already have a trained graph saved in a checkpoint, in theory the simplest solution for you is to export the inference graph via optimize_for_inference.
This tool works both for already-frozen graphs and, as in your case, for graphs with variables still defined. Assuming you go the frozen-graph route, the first step is to turn your graph's variables into constants via:
python freeze_graph.py \
    --input_graph=temp/path/graph.pbtxt \
    --input_checkpoint=temp/path/your_model_name.ckpt \
    --output_graph=frozen_model.pb \
    --output_node_names=name_of_the_output_tensor_you_want_to_use
This will generate a new binary file called frozen_model.pb that has the Variable operations replaced with Const ops holding the values loaded from the checkpoint file.
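To make the effect of freezing concrete, here is a toy sketch of the idea. It uses plain Python dicts rather than TensorFlow's GraphDef, so the node layout, the `freeze` helper, and the checkpoint values are all made up for illustration; the real tool works on protocol buffers and handles many more cases.

```python
# Toy model only: nodes are plain dicts, not TensorFlow NodeDef protos.
def freeze(nodes, checkpoint, output_names):
    """Return a pruned copy of `nodes` with Variable ops turned into Const ops."""
    by_name = {node["name"]: node for node in nodes}

    # 1. Keep only the nodes that the requested outputs depend on.
    needed, stack = set(), list(output_names)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(by_name[name]["inputs"])

    # 2. Swap each surviving Variable for a Const holding its checkpoint value.
    frozen = []
    for node in nodes:
        if node["name"] not in needed:
            continue
        if node["op"] == "Variable":
            node = {"name": node["name"], "op": "Const", "inputs": [],
                    "value": checkpoint[node["name"]]}
        frozen.append(node)
    return frozen


# Hypothetical graph loosely shaped like the model in the question.
graph = [
    {"name": "IteratorGetNext", "op": "IteratorGetNext", "inputs": []},
    {"name": "w", "op": "Variable", "inputs": []},
    {"name": "conv", "op": "Conv2D", "inputs": ["IteratorGetNext", "w"]},
    {"name": "output", "op": "Relu", "inputs": ["conv"]},
    {"name": "loss", "op": "Mean", "inputs": ["output"]},  # training-only node
]
checkpoint = {"w": [0.1, -0.2]}  # made-up values standing in for a .ckpt file

frozen = freeze(graph, checkpoint, ["output"])
print([(n["name"], n["op"]) for n in frozen])
```

In this toy graph the training-only loss node is dropped and w becomes a Const, but the IteratorGetNext node is still there after freezing, which is why a second step is needed.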
Then, you need to generate the inference graph with:
python optimize_for_inference.py \
    --input=frozen_model.pb \
    --output=inference.pb \
    --frozen_graph=True \
    --input_names=IteratorGetNext \
    --output_names=name_of_the_output_tensor_you_want_to_use
This will replace the IteratorGetNext node with a float placeholder. You might want to choose another node, in which case just change the name. You can also change the type of the generated placeholder via the --placeholder_type_enum option. In that case, you need to provide an integer value matching the datatype you want from the DataType enum.
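As a rough aid for picking that integer, the sketch below maps a few common dtype names to their DataType enum values and formats the flag. The table is only a subset, and the `placeholder_type_flag` helper is a hypothetical name of my own, not part of TensorFlow. If TensorFlow is installed, an expression such as tf.int32.as_datatype_enum should give the same numbers.

```python
# Subset of TensorFlow's DataType enum values (as defined in types.proto).
DATA_TYPE_ENUM = {
    "float32": 1,   # DT_FLOAT
    "float64": 2,   # DT_DOUBLE
    "int32": 3,     # DT_INT32
    "uint8": 4,     # DT_UINT8
    "int64": 9,     # DT_INT64
    "bool": 10,     # DT_BOOL
    "float16": 19,  # DT_HALF
}


def placeholder_type_flag(dtype_name):
    """Format the --placeholder_type_enum flag for a dtype name (hypothetical helper)."""
    try:
        value = DATA_TYPE_ENUM[dtype_name]
    except KeyError:
        raise ValueError("dtype not in this sketch's table: %r" % dtype_name)
    return "--placeholder_type_enum=%d" % value


print(placeholder_type_flag("uint8"))  # prints --placeholder_type_enum=4
```

The resulting string can be appended to the optimize_for_inference.py command line when the input is, for example, raw uint8 image bytes rather than floats.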
NOTE: I said "in theory" because, inspecting the generated graph from a test I ran, there still seem to be some odd ops in there that are not really necessary for inference. You might have to process your graph further with NVIDIA's Graph Surgeon or TF's Graph Transform Tool.