mov 0x0ff, 10
sfence
mov 0x0ff, 12
sfence
Can it be executed by an x86 CPU as:
mov 0x0ff, 12
sfence
?
Yes, it is possible that some CPU could execute it as you propose.
Even if you put a stronger fence like mfence in there, or use locked instructions, there is certainly no guarantee that the first write isn't optimized away.
This is true in general: the ordering and fencing rules basically tell you which executions are disallowed and hence guaranteed never to occur. For the complementary set of executions that are allowed to occur, there is usually no guarantee that any particular one can ever actually be observed.
That said, I'm pretty sure that on current x86 chips you'll always be able to observe the occasional 10 value (even if the fences are omitted entirely), despite any store buffer merging, since you could occasionally get an interrupt between the two stores, allowing another core to read 10.
Still, it's not guaranteed. One could certainly imagine a dynamically optimizing x86 implementation, in the style of Transmeta's chips (or NVIDIA's Denver, which applies the same approach to ARM), condensing the above sequence by removing both fences and the first store, making 12 the only observable value.
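To make the distinction between "disallowed" and "allowed but not guaranteed" concrete, here is a minimal sketch in Rust rather than assembly. The function name `observe_once` and the atomic variable standing in for address 0x0ff are assumptions made for illustration. Note also that `fence(Ordering::SeqCst)` is a full barrier (closer to mfence than to sfence), so this is an analogue of the sequence above, not a literal translation.

```rust
use std::sync::atomic::{fence, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical helper: runs the store/fence/store/fence sequence once on a
/// writer thread while the calling thread polls the same location.
/// Returns the distinct values the reader saw, in the order it saw them.
fn observe_once() -> Vec<u32> {
    // Stands in for the memory location 0x0ff from the question.
    let cell = Arc::new(AtomicU32::new(0));
    let writer_cell = Arc::clone(&cell);

    let writer = thread::spawn(move || {
        writer_cell.store(10, Ordering::Relaxed);
        fence(Ordering::SeqCst); // full fence, stronger than sfence
        writer_cell.store(12, Ordering::Relaxed);
        fence(Ordering::SeqCst);
    });

    let mut seen = Vec::new();
    loop {
        let v = cell.load(Ordering::Relaxed);
        // Record a value only when it differs from the last one recorded.
        if seen.last() != Some(&v) {
            seen.push(v);
        }
        if v == 12 {
            break; // 12 is the final value, nothing newer can follow
        }
        std::hint::spin_loop();
    }
    writer.join().unwrap();
    seen
}
```

What is guaranteed is only the negative part: the reader never sees a value other than 0, 10 or 12, never sees 10 after 12, and ends on 12. Whether 10 appears in the returned vector at all varies from run to run; calling `observe_once()` in a loop and counting the runs where the result contains 10 will typically give a number well below the total, and nothing in the memory model forces it to be nonzero.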