It is known that on x86, for the operations load() and store(), the memory orderings memory_order_consume, memory_order_acquire, memory_order_release and memory_order_acq_rel do not require any processor instructions for the cache or pipeline, so the generated assembly always corresponds to std::memory_order_relaxed; these orderings only restrict compiler optimizations: http://www.stdthread.co.uk/forum/index.php?topic=72.0
The disassembly confirms this for store() (MSVS2012 x86_64):
std::atomic<int> a;
a.store(0, std::memory_order_relaxed);
000000013F931A0D mov dword ptr [a],0
a.store(1, std::memory_order_release);
000000013F931A15 mov dword ptr [a],1
But the disassembly does not confirm this for load() (MSVS2012 x86_64), which uses lock cmpxchg:
int val = a.load(std::memory_order_acquire);
000000013F931A1D prefetchw [a]
000000013F931A22 mov eax,dword ptr [a]
000000013F931A26 mov edx,eax
000000013F931A28 lock cmpxchg dword ptr [a],edx
000000013F931A2E jne main+36h (013F931A26h)
std::cout << val << "\n";
But Anthony Williams said:
some_atomic.load(std::memory_order_acquire) does just drop through to a simple load instruction, and some_atomic.store(std::memory_order_release) drops through to a simple store instruction.
Where am I wrong? Do the semantics of std::memory_order_acquire require the processor instruction lock cmpxchg on x86/x86_64, or only a simple load instruction (mov), as Anthony Williams said?
ANSWER: What you see is the same as this bug report: http://connect.microsoft.com/VisualStudio/feedback/details/770885
No. The semantics of std::memory_order_acquire do not require any special processor instructions on x86/x86_64.
No load() or store() operation on x86_64 requires a lock prefix or a fence instruction, except atomic.store(val, std::memory_order_seq_cst), which requires (LOCK) XCHG or, alternatively, MOV (into memory) followed by MFENCE.
Processor memory-barrier instructions for x86 (except CAS), as well as for ARM and PowerPC: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
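To make the acquire/release semantics discussed above concrete, here is a minimal sketch of the usual publish/consume pattern. The names payload, ready and publish_and_consume are hypothetical and chosen only for illustration; they do not come from the question. On x86/x86_64 a conforming compiler is expected to emit plain MOV instructions for both the release store and the acquire load, while the ordering guarantee still holds.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                 // plain, non-atomic data
std::atomic<bool> ready{false};  // flag that publishes the data

// Runs one producer and one consumer thread and returns
// the payload value observed by the consumer.
int publish_and_consume() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread producer([] {
        payload = 42;                                  // ordinary store
        ready.store(true, std::memory_order_release);  // publishes payload
    });
    std::thread consumer([&seen] {
        // Spin until the release store becomes visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        // The acquire load that read 'true' synchronizes with the
        // release store, so the write to payload is visible here.
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Calling publish_and_consume() should return 42: the release/acquire pairing guarantees it in the C++ memory model, regardless of which instructions the compiler picks.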
Disassembler GCC 4.8.1 x86_64 - GDB - load():
temp = a.load(std::memory_order_relaxed);
temp = a.load(std::memory_order_acquire);
temp = a.load(std::memory_order_seq_cst);
0x46140b <+0x007b> mov 0x38(%rsp),%ebx
0x46140f <+0x007f> mov 0x34(%rsp),%esi
0x461413 <+0x0083> mov 0x30(%rsp),%edx
Disassembler GCC 4.8.1 x86_64 - GDB - store():
a.store(temp, std::memory_order_relaxed);
a.store(temp, std::memory_order_release);
a.store(temp, std::memory_order_seq_cst);
0x4613dc <+0x004c> mov %eax,0x20(%rsp)
0x4613e0 <+0x0050> mov 0x38(%rsp),%eax
0x4613e4 <+0x0054> mov %eax,0x20(%rsp)
0x4613e8 <+0x0058> mov 0x38(%rsp),%eax
0x4613ec <+0x005c> mov %eax,0x20(%rsp)
0x4613f0 <+0x0060> mfence
0x4613f3 <+0x0063> mov %ebx,0x20(%rsp)
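The mfence after the seq_cst store in the listing above is there because x86 may otherwise let a later load pass an earlier store. A minimal sketch of the classic store-buffering test shows which outcome seq_cst rules out. The names flag_x, flag_y, both_read_zero_once and count_forbidden_outcomes are hypothetical and not part of the original post.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names, for illustration only.
std::atomic<int> flag_x{0};
std::atomic<int> flag_y{0};

// One round of the store-buffering test. Returns true if both threads
// read 0, which sequential consistency forbids.
bool both_read_zero_once() {
    flag_x.store(0, std::memory_order_seq_cst);
    flag_y.store(0, std::memory_order_seq_cst);
    int r1 = -1;
    int r2 = -1;

    std::thread t1([&r1] {
        flag_x.store(1, std::memory_order_seq_cst);   // XCHG or MOV + MFENCE on x86
        r1 = flag_y.load(std::memory_order_seq_cst);  // plain MOV on x86
    });
    std::thread t2([&r2] {
        flag_y.store(1, std::memory_order_seq_cst);
        r2 = flag_x.load(std::memory_order_seq_cst);
    });

    t1.join();
    t2.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the test and counts how often the forbidden outcome appears.
int count_forbidden_outcomes(int iterations) {
    int count = 0;
    for (int i = 0; i < iterations; ++i) {
        if (both_read_zero_once()) {
            ++count;
        }
    }
    return count;
}
```

With seq_cst on all four operations the count must stay at zero. If the stores were weakened to memory_order_release and the loads to memory_order_acquire, the r1 == 0 && r2 == 0 outcome would become permitted, which is why only the seq_cst store needs the extra instruction.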
Disassembler MSVS 2012 x86_64 - load() (the same behavior as in this bug report: http://connect.microsoft.com/VisualStudio/feedback/details/770885 ):
temp = a.load(std::memory_order_relaxed);
000000013FE51A1F prefetchw [a]
000000013FE51A24 mov eax,dword ptr [a]
000000013FE51A28 nop dword ptr [rax+rax]
000000013FE51A30 mov ecx,eax
000000013FE51A32 lock cmpxchg dword ptr [a],ecx
000000013FE51A38 jne main+40h (013FE51A30h)
000000013FE51A3A mov dword ptr [temp],eax
temp = a.load(std::memory_order_acquire);
000000013FE51A3E prefetchw [a]
000000013FE51A43 mov eax,dword ptr [a]
000000013FE51A47 nop word ptr [rax+rax]
000000013FE51A50 mov ecx,eax
000000013FE51A52 lock cmpxchg dword ptr [a],ecx
000000013FE51A58 jne main+60h (013FE51A50h)
000000013FE51A5A mov dword ptr [temp],eax
temp = a.load(std::memory_order_seq_cst);
000000013FE51A5E prefetchw [a]
000000013FE51A63 mov eax,dword ptr [a]
000000013FE51A67 nop word ptr [rax+rax]
000000013FE51A70 mov ecx,eax
000000013FE51A72 lock cmpxchg dword ptr [a],ecx
000000013FE51A78 jne main+80h (013FE51A70h)
000000013FE51A7A mov dword ptr [temp],eax
Disassembler MSVS 2012 x86_64 - store():
a.store(temp, std::memory_order_relaxed);
000000013F8C1A58 mov eax,dword ptr [temp]
000000013F8C1A5C mov dword ptr [a],eax
a.store(temp, std::memory_order_release);
000000013F8C1A60 mov eax,dword ptr [temp]
000000013F8C1A64 mov dword ptr [a],eax
a.store(temp, std::memory_order_seq_cst);
000000013F8C1A68 mov eax,dword ptr [temp]
000000013F8C1A6C xchg eax,dword ptr [a]
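To check a listing like the ones above with your own compiler, a small translation unit along these lines can help. This is only a sketch: the wrapper function names are hypothetical, and the original test program from the post is not shown, so this is an assumed reconstruction rather than the code that produced the listings. Placing each operation in its own function makes the generated instructions easy to find in the assembly output (for example with g++ -O2 -S, or the /FA listing option in MSVC).

```cpp
#include <atomic>

// Hypothetical reproducer; each function isolates one operation so
// its instructions can be located in the compiler's assembly output.
std::atomic<int> a{0};

int load_relaxed() { return a.load(std::memory_order_relaxed); }
int load_acquire() { return a.load(std::memory_order_acquire); }
int load_seq_cst() { return a.load(std::memory_order_seq_cst); }

void store_relaxed(int v) { a.store(v, std::memory_order_relaxed); }
void store_release(int v) { a.store(v, std::memory_order_release); }
void store_seq_cst(int v) { a.store(v, std::memory_order_seq_cst); }
```

Going by the mapping described in the answer, only store_seq_cst should contain an xchg or an mfence on x86_64; a lock cmpxchg loop in any of the load functions would point to the compiler issue from the bug report.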