I was studying re-entrancy in programming, and on this IBM site (a really good one) I found the code copied below. It is the first example on the page.
The code tries to show the problems with shared access to a variable when a program's control flow is not linear (asynchronous execution), by printing two values that constantly change in a "dangerous context".
#include <signal.h>
#include <stdio.h>
struct two_int { int a, b; } data;
void signal_handler(int signum){
    printf ("%d, %d\n", data.a, data.b);
    alarm (1);
}
int main (void){
    static struct two_int zeros = { 0, 0 }, ones = { 1, 1 };
    signal (SIGALRM, signal_handler);
    data = zeros;
    alarm (1);
    while (1){
        data = zeros;
        data = ones;
    }
}
The problem appeared when I tried to run the code (or rather, it did not appear). I was using gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1) in its default configuration. The inconsistent output never occurs: the frequency of "wrong" pairs of values is 0!
What is going on here? Why is there no re-entrancy problem when using static global variables?
Looking at the godbolt compiler explorer (after adding in the missing #include <unistd.h>), one sees that for almost any x86_64 compiler the code generated uses QWORD moves to load the ones and zeros in a single instruction.
mov rax, QWORD PTR main::ones[rip]
mov QWORD PTR data[rip], rax
The IBM site says "On most machines, it takes several instructions to store a new value in data, and the value is stored one word at a time", which might have been true for typical CPUs in 2005 but, as the code shows, is not true now. Changing the struct to hold two longs rather than two ints would show the issue.
I previously wrote that this was "atomic", which was lazy. The program only runs on a single CPU, and each instruction completes from the point of view of that CPU (assuming nothing else, such as DMA, is altering the memory).
So at the C level there is no guarantee that the compiler will choose a single instruction to write the struct, and so the corruption mentioned in the IBM paper can happen. Modern compilers targeting current CPUs do use a single instruction, and a single instruction is good enough to avoid corruption in a single-threaded program.
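The word-at-a-time scenario from the IBM text can be reproduced deterministically. The following is only a sketch under stated assumptions: it writes the two members as separate stores by hand (standing in for a compiler that does not use a single QWORD move) and uses raise() to deliver SIGALRM exactly between them. The names snapshot, snapshot_handler and observe_torn_pair are made up for this illustration.

```c
#include <assert.h>
#include <signal.h>

struct two_int { int a, b; };

static struct two_int data;      /* shared between main code and handler */
static struct two_int snapshot;  /* what the handler observed (hypothetical helper) */

/* Handler: record whatever is in data at the moment the signal arrives. */
static void snapshot_handler(int signum) {
    (void)signum;
    snapshot = data;
}

/* Emulates "data = ones" compiled as two separate stores, with the signal
 * arriving between them. Returns 1 if the handler saw a mismatched pair. */
int observe_torn_pair(void) {
    void (*old)(int) = signal(SIGALRM, snapshot_handler);
    assert(old != SIG_ERR);

    data.a = 0;
    data.b = 0;        /* data = zeros */
    data.a = 1;        /* first half of data = ones */
    raise(SIGALRM);    /* handler runs here, before raise() returns */
    data.b = 1;        /* second half of data = ones */

    signal(SIGALRM, old);  /* restore the previous disposition */
    return snapshot.a != snapshot.b;
}
```

Because the signal is generated by raise(), the handler runs synchronously and the program stays within defined behaviour; with a real asynchronous alarm() the same interleaving would depend on timing and on the instructions the compiler happened to emit.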
That's not really re-entrancy; you're not running a function twice in the same thread (or in different threads). You can get that via recursion or passing the address of the current function as a callback function-pointer arg to another function. (And it wouldn't be unsafe because it would be synchronous).
This is just plain vanilla data-race UB (Undefined Behaviour) between a signal handler and the main thread: only volatile sig_atomic_t (and, since C11, lock-free atomic types) is guaranteed safe for this. Other types may happen to work, as in your case, where an 8-byte object can be loaded or stored with one instruction on x86-64 and the compiler happens to choose that asm (as @icarus's answer shows).
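For comparison, this is roughly what the guaranteed-safe pattern looks like: the handler only writes a volatile sig_atomic_t flag and the main code reads it. It is a minimal sketch, not the IBM example; got_signal, flag_handler and flag_demo are hypothetical names, and raise() is used only so the demo is deterministic (a real signal would arrive asynchronously).

```c
#include <assert.h>
#include <signal.h>

/* The one object type ISO C always allows an asynchronous handler to write. */
static volatile sig_atomic_t got_signal = 0;

static void flag_handler(int signum) {
    (void)signum;
    got_signal = 1;   /* a single store to a volatile sig_atomic_t */
}

/* Installs the handler, delivers SIGALRM once, and returns the flag value. */
int flag_demo(void) {
    got_signal = 0;
    void (*old)(int) = signal(SIGALRM, flag_handler);
    assert(old != SIG_ERR);

    raise(SIGALRM);            /* handler runs before raise() returns */

    signal(SIGALRM, old);      /* restore the previous disposition */
    return got_signal;
}
```

A flag like this carries only one small value; it does not make a two-member struct safe to share with a handler.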
See MCU programming - C++ O2 optimization breaks while loop - an interrupt handler on a single-core microcontroller is basically the same thing as a signal handler in a single threaded program. In that case the result of the UB is that a load got hoisted out of a loop.
Your test-case of tearing actually happening because of data-race UB was probably developed / tested in 32-bit mode, or with an older dumber compiler that loaded the struct members separately.
In your case, the compiler can optimize the stores out of the infinite loop because no UB-free program could ever observe them: data is not _Atomic or volatile, and there are no other side-effects in the loop.
So there's no way any reader could synchronize with this writer. This in fact happens if you compile with optimization enabled (Godbolt shows an empty loop at the bottom of main). I also changed the struct to two long long, and gcc uses a single movdqa 16-byte store before the loop. (This is not guaranteed atomic, but it is in practice on almost all CPUs, assuming it's aligned, or on Intel as long as it doesn't cross a cache-line boundary. See Why is integer assignment on a naturally aligned variable atomic on x86?)
So compiling with optimization enabled would also break your test, and show you the same value every time. C is not a portable assembly language.
volatile struct two_int would also force the compiler not to optimize the stores away, but would not force it to load/store the whole struct atomically. (It wouldn't stop it from doing so either, though.) Note that volatile does not avoid data-race UB, but in practice it's sufficient for inter-thread communication and was how people built hand-rolled atomics (along with inline asm) before C11 / C++11, for normal CPU architectures. They're cache-coherent, so volatile is in practice mostly similar to _Atomic with memory_order_relaxed for pure-load and pure-store, if used for types narrow enough that the compiler will use a single instruction so you don't get tearing. And of course volatile doesn't have any guarantees from the ISO C standard, unlike code that compiles to the same asm using _Atomic and mo_relaxed.
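One way to get a tear-free pair with a guarantee from the standard, rather than by luck, is to pack both ints into a single 64-bit _Atomic object and use relaxed loads and stores. This is a sketch under assumptions: int is 32 bits, the 64-bit atomic is lock-free on the target (true on x86-64, and required for use from a signal handler), and the helper names pack, unpack, store_pair and load_pair are invented here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct two_int { int a, b; };

static_assert(sizeof(int) == 4, "sketch assumes 32-bit int");

/* Both members live in one atomic object, so they change together. */
static _Atomic uint64_t packed_data;

/* a goes in the high 32 bits, b in the low 32 bits. */
static uint64_t pack(struct two_int v) {
    return ((uint64_t)(uint32_t)v.a << 32) | (uint32_t)v.b;
}

static struct two_int unpack(uint64_t bits) {
    struct two_int v = { (int)(uint32_t)(bits >> 32), (int)(uint32_t)bits };
    return v;
}

/* Writer side: one relaxed atomic store replaces "data = ones". */
void store_pair(struct two_int v) {
    atomic_store_explicit(&packed_data, pack(v), memory_order_relaxed);
}

/* Reader side (e.g. the signal handler): one relaxed atomic load. */
struct two_int load_pair(void) {
    return unpack(atomic_load_explicit(&packed_data, memory_order_relaxed));
}
```

On x86-64 this should compile to the same kind of single 8-byte mov seen on Godbolt, but here the compiler is not allowed to split the access or optimize the stores out of the loop's observable behaviour in the way it can for a plain struct.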
If you had a function that did global_var++; on an int or long long, and you ran it from main and asynchronously from a signal handler, that would be a way to use re-entrancy to create data-race UB.
Depending on how it compiled (to a memory-destination inc or add, or to separate load/inc/store), it would be atomic or not with respect to signal handlers in the same thread. See Can num++ be atomic for 'int num'? for more about atomicity on x86 and in C++. (C11's stdatomic.h and _Atomic qualifier provide equivalent functionality to C++11's std::atomic<T> template.)
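The increment case can be made well-defined by using a C11 atomic for the counter. The sketch below assumes atomic_int is lock-free on the target (needed for use in a handler) and uses hypothetical names increment, increment_handler and count_with_signals. Note that raise() delivers the signal between increments rather than in the middle of one, so this shows only the safe pattern, not the lost-update race itself.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction under strict -std=c11 */
#include <assert.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>

static atomic_int global_var;

/* The function that is run both from main code and from the handler. */
static void increment(void) {
    atomic_fetch_add_explicit(&global_var, 1, memory_order_relaxed);
}

static void increment_handler(int signum) {
    (void)signum;
    increment();
}

/* Increments once in the loop body and once in the handler per iteration,
 * then returns the final counter value. */
int count_with_signals(int iterations) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = increment_handler;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGALRM, &sa, NULL);  /* handler stays installed */
    assert(rc == 0);
    (void)rc;

    atomic_store(&global_var, 0);
    for (int i = 0; i < iterations; i++) {
        increment();       /* from the main flow */
        raise(SIGALRM);    /* from the handler */
    }
    return atomic_load(&global_var);
}
```

With a plain int and global_var++ the result would depend on which instructions the compiler chose; with the atomic read-modify-write no update can be lost regardless of where an asynchronous signal lands.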
An interrupt or other exception can't happen in the middle of an instruction, so a memory-destination add is atomic wrt. context switches on a single-core CPU. Only a (cache-coherent) DMA writer could "step on" an increment from an add [mem], 1 without a lock prefix on a single-core CPU. There aren't any other cores that another thread could be running on.
So it's similar to the case of signals: a signal handler runs instead of the normal execution of the thread handling the signal, so it can't be handled in the middle of one instruction.