My recent efforts to implement a thread/ mutex manager ended up in an 75% CPU load (4 core), while all four running threads were either in sleep or waiting for a mutex beeing unlocked.
The specific class is far too large for being posted here entirely, but I could narrow down the cause to the deadlock-safe acquiring of two mutexes
std::unique_lock<std::mutex> lock1( mutex1, std::defer_lock ); std::unique_lock<std::mutex> lock2( mutex2, std::defer_lock ); std::lock( lock1, lock2 );
Another part of the class uses a std::condition_variable
with wait()
and notify_one()
on mutex1
for some code to be executed selectively at the same time.
The simple change to
std::unique_lock<std::mutex> lock1( mutex1 ); std::unique_lock<std::mutex> lock2( mutex2 );
brought the CPU usage down to normal 1-2%.
I Cant believe, the std::lock()
function is that inefficient. Could this be a bug in g++ 4.6.3?
edit: ( example )
#include <atomic> #include <chrono> #include <condition_variable> #include <iostream> #include <mutex> #include <thread> std::mutex mutex1, mutex2; std::condition_variable cond_var; bool cond = false; std::atomic<bool>done{false}; using namespace std::chrono_literals; void Take_Locks() { while( !done ) { std::this_thread::sleep_for( 1s ); std::unique_lock<std::mutex> lock1( mutex1, std::defer_lock ); std::unique_lock<std::mutex> lock2( mutex2, std::defer_lock ); std::lock( lock1, lock2 ); std::this_thread::sleep_for( 1s ); lock1.unlock(); lock2.unlock(); } } void Conditional_Code() { std::unique_lock<std::mutex> lock1( mutex1, std::defer_lock ); std::unique_lock<std::mutex> lock2( mutex2, std::defer_lock ); std::lock( lock1, lock2 ); std::cout << "t4: waiting \n"; while( !cond ) cond_var.wait( lock1 ); std::cout << "t4: condition met \n"; } int main() { std::thread t1( Take_Locks ), t2( Take_Locks ), t3( Take_Locks ); std::thread t4( Conditional_Code ); std::cout << "threads started \n"; std::this_thread::sleep_for( 10s ); std::unique_lock<std::mutex> lock1( mutex1 ); std::cout << "mutex1 locked \n" ; std::this_thread::sleep_for( 5s ); std::cout << "setting condition/notify \n"; cond = true; cond_var.notify_one(); std::this_thread::sleep_for( 5s ); lock1.unlock(); std::cout << "mutex1 unlocked \n"; std::this_thread::sleep_for( 6s ); done = true; t4.join(); t3.join(); t2.join(); t1.join(); }
On my machine, the following code prints out 10 times a second and consumes almost 0 cpu because most of the time the thread is either sleeping or blocked on a locked mutex:
#include <chrono> #include <thread> #include <mutex> #include <iostream> using namespace std::chrono_literals; std::mutex m1; std::mutex m2; void f1() { while (true) { std::unique_lock<std::mutex> l1(m1, std::defer_lock); std::unique_lock<std::mutex> l2(m2, std::defer_lock); std::lock(l1, l2); std::cout << "f1 has the two locks\n"; std::this_thread::sleep_for(100ms); } } void f2() { while (true) { std::unique_lock<std::mutex> l2(m2, std::defer_lock); std::unique_lock<std::mutex> l1(m1, std::defer_lock); std::lock(l2, l1); std::cout << "f2 has the two locks\n"; std::this_thread::sleep_for(100ms); } } int main() { std::thread t1(f1); std::thread t2(f2); t1.join(); t2.join(); }
Sample output:
f1 has the two locks f2 has the two locks f1 has the two locks ...
I'm running this on OS X and the Activity Monitor application says that this process is using 0.1% cpu. The machine is a Intel Core i5 (4 core).
I'm happy to adjust this experiment in any way to attempt to create live-lock or excessive cpu usage.
Update
If this program is using excessive CPU on your platform, try changing it to call ::lock()
instead, where that is defined with:
template <class L0, class L1> void lock(L0& l0, L1& l1) { while (true) { { std::unique_lock<L0> u0(l0); if (l1.try_lock()) { u0.release(); break; } } std::this_thread::yield(); { std::unique_lock<L1> u1(l1); if (l0.try_lock()) { u1.release(); break; } } std::this_thread::yield(); } }
I would be interested to know if that made any difference for you, thanks.
Update 2
After a long delay, I have written a first draft of a paper on this subject. The paper compares 4 different ways of getting this job done. It contains software you can copy and paste into your own code and test yourself (and please report back what you find!):
http://howardhinnant.github.io/dining_philosophers.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With