I've gotten myself into a confused mess regarding multithreaded programming and was hoping someone could come and slap some understanding in me.
After doing quite a bit of reading, I've come to the understanding that I should be able to set the value of a 64-bit int atomically on a 64-bit system [1].
I found a lot of this reading difficult though, so I thought I would write a test to verify this. I wrote a simple program with one thread that sets a variable to one of two values:
    bool switcher = false;
    while (true)
    {
        if (switcher)
            foo = a;
        else
            foo = b;
        switcher = !switcher;
    }
And another thread which would check the value of foo:
    while (true)
    {
        __uint64_t blah = foo;
        if ((blah != a) && (blah != b))
        {
            cout << "Not atomic! " << blah << endl;
        }
    }
I set a = 1844674407370955161 and b = 1144644202170355111. I run this program and get no output warning me that blah is not a or b.
Great, looks like it probably is an atomic write... but then I changed the first thread to assign the values of a and b directly as constants, like so:
    bool switcher = false;
    while (true)
    {
        if (switcher)
            foo = 1844674407370955161;
        else
            foo = 1144644202170355111;
        switcher = !switcher;
    }
I re-run, and suddenly:
Not atomic! 1144644203261303193
Not atomic! 1844674406280007079
Not atomic! 1144644203261303193
Not atomic! 1844674406280007079
What's changed? Either way I'm assigning a large number to foo - does the compiler handle a constant number differently, or have I misunderstood everything?
Thanks!
2: GCC Development list discussing that GCC doesn't guarantee it in the documentation, but the kernel and other programs rely on it
Disassembling the loop, I get the following code with gcc:
.globl _switcher
_switcher:
LFB2:
pushq %rbp
LCFI0:
movq %rsp, %rbp
LCFI1:
movl $0, -4(%rbp)
L2:
cmpl $0, -4(%rbp)
je L3
movq _foo@GOTPCREL(%rip), %rax
movl $-1717986919, (%rax)
movl $429496729, 4(%rax)
jmp L5
L3:
movq _foo@GOTPCREL(%rip), %rax
movl $1486032295, (%rax)
movl $266508246, 4(%rax)
L5:
cmpl $0, -4(%rbp)
sete %al
movzbl %al, %eax
movl %eax, -4(%rbp)
jmp L2
LFE2:
So it would appear that gcc does use the 32-bit movl instruction with 32-bit immediate values. There is an instruction, movq, that can move a 64-bit register to memory (or memory to a 64-bit register), but it does not seem to be able to move a 64-bit immediate value to a memory address, so the compiler is forced either to load the value into a temporary register and then move that register to memory, or to use movl, one store per 32-bit half, as it does here. You can try to force it to use a register by using a temporary variable, but this may not work.
References:
http://www.x86-64.org/documentation/assembly.html
immediate values inside instructions remain 32 bits.
There is no way for the compiler to assign a 64-bit constant atomically except by first loading it into a register and then moving that register to the variable. That is probably more costly than assigning directly to the variable, and since the language does not require atomicity, the compiler does not choose the atomic sequence.
The Intel CPU documentation is right: aligned 8-byte reads and writes are always atomic on recent hardware (even on 32-bit operating systems).
What you don't tell us is whether you are running a 32-bit system on 64-bit hardware. If so, the compiler will most likely split the 8-byte write into two 4-byte writes.
Just have a look at the relevant section in the object code.