
Tensorflow Object Detection API has slow inference time with tensorflow serving

I am unable to match the inference times reported by Google for models released in their model zoo. Specifically, I am trying out their faster_rcnn_resnet101_coco model, for which the reported inference time is 106 ms on a Titan X GPU.

My serving system uses TF 1.4 running in a container built from the Dockerfile released by Google. My client is modeled after the inception client, also released by Google.

I am running Ubuntu 14.04 and TF 1.4 with one Titan X. My total inference time is about 3x worse than Google reports, at ~330 ms: making the tensor proto takes ~150 ms and the Predict call takes ~180 ms. My saved_model.pb comes directly from the tar file downloaded from the model zoo. Is there something I am missing? What steps can I take to reduce the inference time?
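
For reference, here is a minimal sketch of the kind of client code I am timing. The host, port, model name, signature name, and input shape are placeholders (my real client follows the inception client), so treat them as assumptions rather than my exact setup.

```python
import time
import numpy as np
import tensorflow as tf
from grpc.beta import implementations
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2

# Placeholder host/port/model details, not my exact configuration.
channel = implementations.insecure_channel('localhost', 9000)
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

image = np.zeros((1, 600, 600, 3), dtype=np.uint8)  # dummy input image

request = predict_pb2.PredictRequest()
request.model_spec.name = 'faster_rcnn_resnet101_coco'
request.model_spec.signature_name = 'serving_default'

start = time.time()
request.inputs['inputs'].CopyFrom(
    tf.contrib.util.make_tensor_proto(image))
proto_done = time.time()
result = stub.Predict(request, 10.0)  # 10 second timeout
end = time.time()

print('make_tensor_proto: %.0f ms' % ((proto_done - start) * 1000))
print('Predict:           %.0f ms' % ((end - proto_done) * 1000))
```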

asked Dec 19 '17 by Sid M

2 Answers

I was able to solve the two problems (the slow tensor proto construction and the slow Predict call) by:

  1. Optimizing the compiler flags. I rebuilt the serving binary with bazel, passing --config=opt --copt=-msse4.1 --copt=-msse4.2 --copt=-mavx --copt=-mavx2 --copt=-mfma.

  2. Not importing tf.contrib for every inference. In the inception_client example provided by Google, the lines that build the tensor proto go through tf.contrib, which re-imports tf.contrib for every forward pass; moving that out of the per-request path fixes it (see the sketch below).
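
A minimal sketch of point 2, under the assumption that the client (like the inception client) calls tf.contrib.util.make_tensor_proto per request: resolve the tf.contrib attribute once at startup so its lazy loading is not paid inside the timed request path. The function and variable names here are illustrative, not from my actual client.

```python
import time
import numpy as np
import tensorflow as tf

# Resolve tf.contrib once at startup; its lazy loading happens here,
# outside the per-request path (illustrative warm-up, names are mine).
make_tensor_proto = tf.contrib.util.make_tensor_proto

def build_request_input(image):
    # Per-request work is now plain proto construction.
    batched = np.expand_dims(np.asarray(image, dtype=np.uint8), axis=0)
    return make_tensor_proto(batched)

start = time.time()
tensor_proto = build_request_input(np.zeros((600, 600, 3), dtype=np.uint8))
print('tensor proto build: %.0f ms' % ((time.time() - start) * 1000))
```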

answered by Sid M


Non-max suppression may be the bottleneck: https://github.com/tensorflow/models/issues/2710.

Is the image size 600x600?

answered by Vikram Gupta