
What is the overhead associated with std::condition_variable_any

I have read in many places that there is some overhead associated with std::condition_variable_any. Just wondering, what is this overhead?

My guess is that, since this is a generic condition variable that can work with any type of lock, it requires a hand-rolled implementation of waiting (perhaps with another condition_variable and mutex, or a futex, or something similar), and the extra overhead comes from that, as opposed to it being a thin wrapper around pthread_cond_wait() (and its equivalents on other systems). But I'm not sure.


As a follow-up: if I were implementing something that waits on, say, a shared mutex, is this type of condition variable a bad choice because of the performance overhead? What else can I do in this situation?
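For concreteness, the shared-mutex case could look like the minimal sketch below. The `SharedValue` type and `run_shared_wait_demo()` are hypothetical names made up for illustration; only the standard library types are real. It assumes C++17 for std::shared_mutex.

```cpp
#include <condition_variable>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical example type: readers block under a shared lock until a
// writer publishes a value under an exclusive lock.
struct SharedValue {
    std::shared_mutex mtx;
    std::condition_variable_any cv;  // needed because the lock is not std::unique_lock<std::mutex>
    bool ready = false;
    int value = 0;

    int wait_and_read() {
        std::shared_lock<std::shared_mutex> lock(mtx);
        // wait() unlocks the shared lock while blocked and re-locks it before returning.
        cv.wait(lock, [this] { return ready; });
        return value;
    }

    void publish(int v) {
        {
            std::unique_lock<std::shared_mutex> lock(mtx);
            value = v;
            ready = true;
        }
        cv.notify_all();
    }
};

// Hypothetical driver: one reader thread waits, the calling thread publishes.
int run_shared_wait_demo() {
    SharedValue sv;
    int seen = -1;
    std::thread reader([&] { seen = sv.wait_and_read(); });
    sv.publish(42);
    reader.join();
    return seen;
}
```

std::condition_variable only accepts std::unique_lock<std::mutex>, so passing a std::shared_lock like this is exactly the case where condition_variable_any is required.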

Curious asked Oct 07 '17 08:10



1 Answer

pthread_cond_wait() / SleepConditionVariableSRW(), like the plain std::condition_variable::wait() built on them, need just a single, atomic native call to release the mutex, wait on the condition variable, and re-acquire the mutex. The thread goes to sleep immediately, and another thread (ideally one that was blocked on the mutex) can take over right away on the same core.
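As a point of reference, this is roughly what the native path looks like from user code. It is only a sketch: `Flag` and `run_native_wait_demo()` are hypothetical names for illustration, and how wait() maps onto the underlying OS primitive is up to the standard library implementation.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical example state guarded by a plain std::mutex.
struct Flag {
    std::mutex mtx;
    std::condition_variable cv;  // works only with std::unique_lock<std::mutex>
    bool set = false;
};

// Hypothetical driver: the calling thread waits, a helper thread sets the flag.
bool run_native_wait_demo() {
    Flag f;
    std::thread setter([&] {
        {
            std::lock_guard<std::mutex> guard(f.mtx);
            f.set = true;
        }
        f.cv.notify_one();
    });

    std::unique_lock<std::mutex> lock(f.mtx);
    // Release, block, and re-acquire are all handled inside this one wait() call.
    f.cv.wait(lock, [&] { return f.set; });
    bool result = f.set;
    lock.unlock();

    setter.join();
    return result;
}
```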

With std::condition_variable_any, unlocking the passed BasicLockable and starting to wait on the native event / condition is more than a single native call: it first invokes unlock() on the BasicLockable and only then issues the call that waits. So you have at least the overhead of the separate unlock(), and you are more likely to trigger a less-than-ideal scheduling decision on the OS side. In the worst case, the unlock even resumes a waiting thread on a different core, with all the associated overhead.
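The sequence described above can be sketched as a generic wrapper over std::condition_variable and an internal std::mutex. This is a simplified illustration under stated assumptions, not the code of any real standard library: `GenericCondVar` and `run_generic_wait_demo()` are hypothetical names, and the sketch skips exception safety and lifetime handling that a production implementation would need.

```cpp
#include <condition_variable>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical, simplified stand-in for a condition_variable_any-style class.
class GenericCondVar {
    std::mutex internal_;          // private mutex the native wait runs on
    std::condition_variable cv_;   // native condition variable

public:
    template <class Lock>
    void wait(Lock& user_lock) {
        // Take the internal mutex first so a notify cannot slip in between
        // the user unlock and the native wait.
        std::unique_lock<std::mutex> internal(internal_);
        user_lock.unlock();   // step 1: separate unlock of the caller's lock
        cv_.wait(internal);   // step 2: native wait on the internal mutex
        internal.unlock();    // drop the internal mutex before re-locking
        user_lock.lock();     // step 3: separate re-lock of the caller's lock
        // Not exception-safe: a real implementation would re-lock via RAII.
    }

    template <class Lock, class Pred>
    void wait(Lock& user_lock, Pred pred) {
        while (!pred()) wait(user_lock);  // loop also covers spurious wake-ups
    }

    void notify_one() {
        std::lock_guard<std::mutex> guard(internal_);
        cv_.notify_one();
    }

    void notify_all() {
        std::lock_guard<std::mutex> guard(internal_);
        cv_.notify_all();
    }
};

// Hypothetical driver: wait under a shared lock while another thread writes.
int run_generic_wait_demo() {
    GenericCondVar cv;
    std::shared_mutex mtx;
    bool ready = false;
    int value = 0;

    std::thread producer([&] {
        {
            std::unique_lock<std::shared_mutex> writer(mtx);
            value = 7;
            ready = true;
        }
        cv.notify_all();
    });

    int seen = -1;
    {
        std::shared_lock<std::shared_mutex> reader(mtx);
        cv.wait(reader, [&] { return ready; });
        seen = value;
    }
    producer.join();
    return seen;
}
```

Each wait here goes through an extra mutex acquisition plus separate unlock() and lock() calls on the caller's lock, which is where the additional cost over the native path would come from in a design like this.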

The same holds in the other direction, e.g. on spurious wake-ups: with a native mutex (as used in std::mutex) the OS can apply scheduling optimizations that don't apply to a generic BasicLockable.

Both involve some bookkeeping to provide the notify_all() logic (it's actually one event / condition per waiting thread) and to guarantee that all methods are atomic, so both come with a small overhead anyway.

The real overhead depends on how well the OS can make a good scheduling decision for the combined release-wait-and-reacquire call. If the OS isn't smart about scheduling, the choice between the two makes virtually no difference.

Ext3h answered Nov 08 '22 19:11
