I'm writing a C++ wrapper around the TensorFlow 1.2 C API (for inference, if it matters). My application is multi-process and multi-threaded, with resources allocated explicitly, so I would like to limit TensorFlow to a single thread.
Currently, when I run a simple inference test that allows batch processing, TensorFlow uses all CPU cores. I have tried limiting the number of threads for a new session with a mixture of C and C++, as follows (forgive the partial code snippet; I hope it makes sense):
    tensorflow::ConfigProto conf;
    conf.set_intra_op_parallelism_threads(1);
    conf.set_inter_op_parallelism_threads(1);
    conf.add_session_inter_op_thread_pool()->set_num_threads(1);
    // Serialize the proto and hand the raw bytes to the C API.
    std::string str;
    conf.SerializeToString(&str);
    TF_SetConfig(m_session_opts, str.data(), str.size(), m_status);
    m_session = TF_NewSession(m_graph, m_session_opts, m_status);
But this makes no difference: all cores are still fully utilized.
Am I using the C API correctly?
(My current workaround is to recompile TensorFlow with the number of threads hard-coded to 1. That will probably work, but it's obviously not the best approach.)
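For readers who want to call TF_SetConfig without linking the C++ protobuf classes, here is a minimal sketch that builds the serialized ConfigProto bytes by hand. It assumes the field numbers in tensorflow/core/protobuf/config.proto (intra_op_parallelism_threads = 2, inter_op_parallelism_threads = 5); the helper names AppendVarint, AppendVarintField and EncodeThreadConfig are made up for illustration and are not part of TensorFlow. It only shows how the bytes are produced; whether the runtime honours them is a separate matter.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append `value` as a protobuf base-128 varint (7 payload bits per byte,
// high bit set on every byte except the last).
inline void AppendVarint(std::vector<std::uint8_t>& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Append one varint-typed field: key = (field_number << 3) | wire type 0.
inline void AppendVarintField(std::vector<std::uint8_t>& out,
                              std::uint32_t field_number,
                              std::uint64_t value) {
    assert(field_number > 0);
    AppendVarint(out, static_cast<std::uint64_t>(field_number) << 3);
    AppendVarint(out, value);
}

// Hypothetical helper: serialized ConfigProto carrying only the two
// thread-count fields. Field numbers are assumed from config.proto.
inline std::vector<std::uint8_t> EncodeThreadConfig(std::uint32_t intra_op_threads,
                                                    std::uint32_t inter_op_threads) {
    constexpr std::uint32_t kIntraOpField = 2;  // intra_op_parallelism_threads
    constexpr std::uint32_t kInterOpField = 5;  // inter_op_parallelism_threads
    std::vector<std::uint8_t> bytes;
    AppendVarintField(bytes, kIntraOpField, intra_op_threads);
    AppendVarintField(bytes, kInterOpField, inter_op_threads);
    return bytes;
}
```

With `auto cfg = EncodeThreadConfig(1, 1);`, the buffer holds the four bytes 0x10 0x01 0x28 0x01 and could be passed as `TF_SetConfig(m_session_opts, cfg.data(), cfg.size(), m_status);`, which should be equivalent to serializing a ConfigProto that has only those two fields set.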
-- Update --
I also tried adding:
conf.set_use_per_session_threads(true);
That did not help either; multiple cores are still used.
I also tried to run with high log verbosity, and got this output (showing only what I think is relevant):
tensorflow/core/common_runtime/local_device.cc:40] Local device intraop parallelism threads: 8
tensorflow/core/common_runtime/session_factory.cc:75] SessionFactory type DIRECT_SESSION accepts target:
tensorflow/core/common_runtime/direct_session.cc:95] Direct session inter op parallelism threads for pool 0: 1
The "Local device intraop parallelism threads: 8" message appears as soon as I instantiate a new graph with TF_NewGraph(). I have not found a way to specify options before this graph allocation, though.
I had the same problem and solved it by setting the number of threads when creating the first TF session in my application. If the first session is created without an options object, TF creates as many worker threads as twice the number of cores on the machine.
Here is the C++ code I used:
    // Call when application starts
    void InitThreads(int coresToUse)
    {
        // initialize the number of worker threads
        tensorflow::SessionOptions options;
        tensorflow::ConfigProto & config = options.config;
        if (coresToUse > 0)
        {
            config.set_inter_op_parallelism_threads(coresToUse);
            config.set_intra_op_parallelism_threads(coresToUse);
            config.set_use_per_session_threads(false);
        }
        // now create a session to make the change
        std::unique_ptr<tensorflow::Session>
            session(tensorflow::NewSession(options));
        session->Close();
    }
Pass 1 to limit the inter-op and intra-op thread counts to 1 each.
Edit: IMPORTANT NOTE: This code works when called from the main application (Google's sample trainer) but stopped working when I moved it into a DLL dedicated to wrapping TensorFlow. TF 1.4.1 ignores the parameter I pass and spins up all threads. I would like to hear your comments.