
Do I need std::atomic<bool> or is POD bool good enough?

Consider this code:

// global
std::atomic<bool> run{true};

// thread 1
while (run) { /* do stuff */ }

// thread 2
/* do stuff until it's time to shut down */
run = false;

Do I need the overhead associated with the atomic variable here? My intuition is that the read/write of a boolean variable is more or less atomic anyway (this is a common g++/Linux/Intel setup) and if there is some write/read timing weirdness, and my run loop on thread 1 stops one pass early or late as a result, I'm not super worried about it for this application.

Or is there some other consideration I am missing here? Looking at perf, it appears my code is spending a fair amount of time in std::atomic_bool::operator bool and I'd rather have it in the loop instead.

asked Jun 21 '17 20:06 by John S


1 Answer

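You need to use std::atomic. With a plain bool, the concurrent read and write are a data race, which is undefined behavior in C++, so the compiler is free to make undesired optimizations (reading the value once and then either always looping or never looping). std::atomic also gives you correct behavior on systems without a strongly ordered memory model. x86 is strongly ordered, so a store becomes visible to other cores promptly and in program order; on weakly ordered systems, a plain store carries no ordering guarantees relative to the surrounding memory operations, so the other thread may observe it out of order with respect to the data the flag is meant to guard.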
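To make the "reading the value once" optimization concrete, here is a rough single-threaded sketch of the transformation a compiler is permitted to apply to a plain bool flag. The names run_plain, loop_as_written and loop_after_hoisting are made up for illustration, and the second function is a hand-written stand-in for what an optimizer might emit, not actual compiler output.

```cpp
#include <cstddef>

// Hypothetical non-atomic flag, for illustration only.
bool run_plain = true;

// The loop as it appears in the source: the flag is tested on every pass.
std::size_t loop_as_written(std::size_t max_iters) {
    std::size_t n = 0;
    while (run_plain && n < max_iters) {
        ++n;  // stand-in for "do stuff"
    }
    return n;
}

// A form the optimizer may legally produce: nothing in the loop body writes
// run_plain, and a write from another thread would be a data race, so the
// compiler may assume the value never changes and read it only once.
std::size_t loop_after_hoisting(std::size_t max_iters) {
    std::size_t n = 0;
    const bool cached = run_plain;  // single read, hoisted out of the loop
    if (cached) {
        while (n < max_iters) {
            ++n;
        }
    }
    return n;
}
```

In a single thread the two functions are indistinguishable, which is exactly why the transformation is allowed. Only a second thread writing the flag could tell them apart, and with a plain bool that write is undefined behavior.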

You can improve the performance though. By default, std::atomic operations use sequentially consistent ordering, which is overkill for a single flag value. You can speed it up by using load/store with an explicit (and less strict) memory ordering, so each load isn't required to use the most paranoid mode to maintain consistency.

For example, you could do:

// global
std::atomic<bool> run{true};

// thread 1
while (run.load(std::memory_order_acquire)) { /* do stuff */ }

// thread 2
/* do stuff until it's time to shut down */
run.store(false, std::memory_order_release);

On an x86 machine, any ordering less strict than the default (and most strict) sequential consistency typically costs nothing at the hardware level: acquire loads and release stores compile to ordinary mov instructions, and the ordering only constrains how the compiler may reorder the surrounding code. No bus locking or the like is required, because of the strongly ordered memory model. Thus, aside from guaranteeing the value is actually read from memory rather than cached in a register and reused, using atomics this way on x86 is free, and on non-x86 machines it makes your code correct (which it otherwise wouldn't be).
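For completeness, here is a minimal sketch of the snippet above as a self-contained program fragment with a real worker thread. The names keep_running, worker_loop and run_demo are assumptions for illustration, as is the 20 ms sleep standing in for "do stuff until it's time to shut down".

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stop flag; brace-initialized so it also compiles before C++17.
std::atomic<bool> keep_running{true};

// Thread 1: spins until the flag is cleared and reports how many passes ran.
std::uint64_t worker_loop() {
    std::uint64_t passes = 0;
    while (keep_running.load(std::memory_order_acquire)) {
        ++passes;  // stand-in for "do stuff"
    }
    return passes;
}

// Thread 2 (the caller): starts the worker, waits briefly, requests shutdown.
std::uint64_t run_demo() {
    keep_running.store(true, std::memory_order_relaxed);
    std::uint64_t passes = 0;
    std::thread worker([&passes] { passes = worker_loop(); });

    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    keep_running.store(false, std::memory_order_release);

    worker.join();  // join() synchronizes, so reading passes afterwards is safe
    assert(!keep_running.load(std::memory_order_acquire));
    return passes;
}
```

Calling run_demo() from main returns the number of passes the worker completed before it saw the flag change. The exact count depends on scheduling, so it will differ from run to run.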

answered Sep 21 '22 06:09 by ShadowRanger