TL;DR: I need the Microsoft C (not C++) equivalent of C11's atomic_load. Anyone know what the right function is?
I have some pretty standard code which uses atomics. Something like
do {
    bar = atomic_load(&foo);
    baz = some_stuff(bar);
} while (!atomic_compare_exchange_weak(&foo, &bar, baz));
I'm trying to figure out how to handle it with MSVC. The CAS is easy enough (InterlockedCompareExchange), but atomic_load is proving more troublesome.
Maybe I'm missing something, but the Synchronization Functions list on MSDN doesn't seem to have anything for a simple load. The only thing I can think of would be something like InterlockedOr(object, 0), which would generate a store for every load (not to mention a fence)…
As long as the variable is volatile I think it would be safe to just read the value, but if I do that Visual Studio's code analysis feature emits a bunch of C28112 warnings ("A variable (foo) which is accessed via an Interlocked function must always be accessed via an Interlocked function.").
If a simple read is really the right way to go I think I could silence those with something like
#define atomic_load(object) \
    __pragma(warning(push)) \
    __pragma(warning(disable:28112)) \
    (*(object)) \
    __pragma(warning(pop))
But the analyzer's insistence that I should always be using the Interlocked* functions leads me to believe there must be a better way. If that's the case, what is it?
Atomic operations are intended to allow access to shared data without extra protection (mutex, rwlock, …). This may improve single-thread performance, scalability, and overall system performance.
Loads and stores
For that to be possible, such data must exist in shared memory or cache. Thus, an atomic load loads data from shared memory into either a register or thread-specific memory, depending on the processor architecture. Atomic stores move data into shared memory atomically.
When the hardware cannot perform the operation lock-free, an atomic type may instead be implemented with a mutex lock: if one thread acquires the lock, no other thread can acquire it until that thread releases it.
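As a rough illustration of that fallback (the guarded_atomic type and its guarded_load function are my own sketch using a Windows CRITICAL_SECTION, not anything MSVC actually generates):

#include <windows.h>

/* Hypothetical lock-based "atomic" for data too wide for a single load.
   InitializeCriticalSection(&a->lock) must be called once before use. */
typedef struct {
    CRITICAL_SECTION lock;
    LONG64 value[2];            /* 128-bit payload */
} guarded_atomic;

static void guarded_load(guarded_atomic *a, LONG64 out[2])
{
    EnterCriticalSection(&a->lock);
    out[0] = a->value[0];
    out[1] = a->value[1];
    LeaveCriticalSection(&a->lock);
}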
I think ignoring the analyzer is acceptable here, given the documentation says simple reads of properly aligned, register-width variables are atomic (32-bit on 32-bit systems, 64-bit on 64-bit systems). The warning's own documentation basically says it's being overly cautious, even when the access might be safe.
That said, if you want to shut it up, you can always use an idempotent Interlocked operation to get the desired behavior. For example, you could just define:
#define atomic_load(object) InterlockedOr((object), 0)
Since bitwise OR with 0 never changes the value, and the function always returns the original value, the end result is to read the original value while atomically writing nothing.
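One caveat, shown as a sketch below (the macro names atomic_load32 and atomic_load64 are mine): the Interlocked family is not type-generic, so the object's width has to match the function you pick. InterlockedOr works on a volatile LONG, while a 64-bit object needs InterlockedOr64.

#include <windows.h>

/* Width-specific variants; the macro names are hypothetical. */
#define atomic_load32(object) InterlockedOr((object), 0)     /* expects volatile LONG * */
#define atomic_load64(object) InterlockedOr64((object), 0)   /* expects volatile LONG64 * */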
If you were simulating atomic_load_explicit with memory_order_relaxed, you might get better performance by using InterlockedOrNoFence to avoid memory barriers, but for simulating the default (sequentially consistent) atomic_load you'd want to stick with InterlockedOr.
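For example, a rough mapping of the C11 load orderings onto the Interlocked OR family might look like the following sketch (the macro names are my own; on x86/x64 all three variants compile to the same locked instruction, so the distinction mainly matters on ARM):

#include <windows.h>

/* Hypothetical per-ordering load macros built on the OR-with-0 trick. */
#define my_load_relaxed(object) InterlockedOrNoFence((object), 0)  /* ~ memory_order_relaxed */
#define my_load_acquire(object) InterlockedOrAcquire((object), 0)  /* ~ memory_order_acquire */
#define my_load_seq_cst(object) InterlockedOr((object), 0)         /* ~ memory_order_seq_cst */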
InterlockedOr was chosen mostly arbitrarily (on the theory that it might be slightly faster in hardware than an operation with carry like addition or subtraction), but InterlockedXor with 0 should behave the same way, as would several other operations, as long as they were used with their identity value.
You could also use InterlockedCompareExchange in a similar manner; testing would be needed to determine which was faster:
#define atomic_load(object) InterlockedCompareExchange((object), 0, 0)
where again, if the value is already 0 it is set back to 0, but all you're really using it for is the return value: the original value before the no-op exchange.
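Putting it together with the CAS from the question, a sketch of the whole loop in plain MSVC C could look like this (assuming foo is a volatile LONG and some_stuff stands in for the real computation):

#include <windows.h>

static volatile LONG foo;

static LONG some_stuff(LONG value)
{
    return value + 1;   /* placeholder for the real computation */
}

static void update_foo(void)
{
    LONG bar, baz;
    do {
        /* atomic_load(&foo) via the no-op OR */
        bar = InterlockedOr(&foo, 0);
        baz = some_stuff(bar);
        /* InterlockedCompareExchange returns the value that was in foo;
           the exchange succeeded only if that equals bar */
    } while (InterlockedCompareExchange(&foo, baz, bar) != bar);
}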