I am trying to run SqueezeDet using the TensorFlow C++ API (CPU only). I have frozen the TensorFlow graph and loaded it from C++. Detection quality is fine, but inference is much slower than in Python. What could be the reason?
Simplified, my code looks like this:
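#include <iostream>
#include <memory>
#include <string>
#include <vector>
#include "tensorflow/core/framework/tensor.h"
#include "tensorflow/core/platform/env.h"
#include "tensorflow/core/public/session.h"
using namespace std;

int main(int argc, const char* argv[])
{
    // Graph definition to be filled from the frozen model
    tensorflow::GraphDef graph_def;
    // Path to the frozen graph file
    string graph_file_name = "Model/graph.pb";
    // Load the graph
    tensorflow::Status graph_loaded_status = tensorflow::ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);
    if (!graph_loaded_status.ok())
    {
        cout << graph_loaded_status.ToString() << endl;
        return 1;
    }
    // Create a session and attach the graph to it
    unique_ptr<tensorflow::Session> session_sqdet(tensorflow::NewSession(tensorflow::SessionOptions()));
    tensorflow::Status session_create_status = session_sqdet->Create(graph_def);
    if (!session_create_status.ok())
    {
        cout << "Session create status: fail." << endl;
        return 1;
    }
    while (true /* loop condition omitted in this simplified version */)
    {
        /* create & preprocess batch */
        tensorflow::Tensor input_tensor, prob_tensor;  // filled by the preprocessing step
        vector<tensorflow::Tensor> final_output;
        // Run inference on the loaded session (the original snippet called an undeclared `session`)
        tensorflow::Status run_status = session_sqdet->Run({{"image_input", input_tensor}, {"keep_prob", prob_tensor}}, {"probability/score", "bbox/trimming/bbox"}, {}, &final_output);
        if (!run_status.ok())
        {
            cout << run_status.ToString() << endl;
            return 1;
        }
        /* do some postprocessing */
    }
}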
What I have tried:
1) Using optimization flags: all are enabled, and no warnings are shown.
2) Using batching: performance improved, but the gap between Python and C++ is still significant (running the session takes 1 s in Python vs 2.4 s in C++ with batch_size = 20).
Any help would be highly appreciated.
I spent a lot of time on this problem (most of it because of silly mistakes of my own), but I finally solved it. I am posting my experience here in case it is useful.
These are the steps I'd advise anyone facing the same issue to follow (some of them are quite obvious):
0) Profile properly! Make sure the tools you use are reliable in your multicore/GPU/whatever setting.
1) Check that TensorFlow and all related packages are built with all optimizations on.
2) Optimize the graph after freezing.
3) If you use different batch sizes during training and inference, make sure you have removed all dependencies on the batch size from the model! Otherwise you will get neither an error message nor worse detection quality, only a mysterious slowdown!
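To illustrate step 0, here is a minimal sketch of one way a timing tool can mislead in a multicore setting. It assumes the slowdown is being judged with hand-written timers. `Timing`, `measure` and `busy_work` are hypothetical names made up for this example and are not part of TensorFlow; `busy_work` is only a stand-in for a multithreaded `session->Run(...)` call. The sketch records both wall-clock time (`std::chrono::steady_clock`) and processor time (`std::clock`) for the same region.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <functional>
#include <thread>
#include <vector>

// Both measurements for one timed region, in milliseconds per run.
struct Timing {
    double wall_ms;  // elapsed real time, from std::chrono::steady_clock
    double cpu_ms;   // processor time, as reported by std::clock
};

// Runs `work` `runs` times and returns the average per-run timings.
// (Hypothetical helper, not a TensorFlow API.)
Timing measure(const std::function<void()>& work, int runs = 1) {
    assert(runs > 0);
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();
    for (int i = 0; i < runs; ++i) {
        work();
    }
    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    t.wall_ms = std::chrono::duration<double, std::milli>(wall_end - wall_start).count() / runs;
    t.cpu_ms = 1000.0 * static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC / runs;
    return t;
}

// Hypothetical stand-in for a multithreaded inference call:
// starts `threads` workers that each spin for `iterations` steps.
void busy_work(int threads, int iterations) {
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([iterations] {
            volatile double x = 0.0;  // volatile keeps the loop from being optimized away
            for (int i = 0; i < iterations; ++i) {
                x = x + i * 0.5;
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
}
```

For example, `measure([] { busy_work(4, 50000000); })` on a multicore Linux machine will usually report a `cpu_ms` well above `wall_ms`, because there `std::clock` reports CPU time for the whole process, summed over its threads. Comparing a CPU-time figure from one program with a wall-clock figure from another (for instance Python's `time.time()`) can therefore suggest a gap that is not real, or hide one that is.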