
Optimization of fenced memory stores on x86 CPU

mov 0x0ff, 10
sfence 
mov 0x0ff, 12
sfence

Can an x86 CPU execute this as:

 mov 0x0ff, 12
 sfence

?

Gilgamesz asked Mar 10 '18 21:03



1 Answer

Yes, it is possible that some CPU could execute it as you propose.

Even if you put a stronger fence like mfence in there, or use locked instructions, there is certainly no guarantee that the first write isn't optimized away.

This is true in general: the ordering and fencing rules basically tell you which executions are disallowed and hence guaranteed never to occur. For the complementary set of executions that are allowed to occur, there is usually no guarantee that any particular execution will ever actually be observed.

That said, I'm pretty sure that on current x86 chips you'll always be able to observe the occasional 10 value (even if the fences are omitted entirely), despite any store buffer merging, since you could occasionally get an interrupt between the two stores, allowing you to read 10.

Still, it's not guaranteed: one could certainly imagine a dynamically optimizing x86 design, in the style of Denver or Transmeta, condensing the above sequence by removing both fences and the first store, making 12 the only observable value.
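The difference between what is guaranteed and what is merely possible can be probed with a small experiment. What follows is only a rough sketch, written in portable C++ rather than assembly. It assumes that `std::atomic_thread_fence(std::memory_order_seq_cst)` is an acceptable stand-in for the fences in the question, which makes it stronger than sfence. The names `cell`, `Observed` and `run_once` are made up for illustration, with `cell` playing the role of address 0x0ff.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for the memory location 0x0ff in the question.
std::atomic<int> cell{0};

struct Observed {
    long saw10 = 0;     // reads that returned the intermediate value 10
    long sawOther = 0;  // reads outside {0, 10, 12}; must stay at zero
    int last = 0;       // value returned by the reader's final read
};

// Runs the two fenced stores once while another thread polls the cell.
Observed run_once() {
    cell.store(0, std::memory_order_relaxed);
    Observed obs;
    std::atomic<bool> done{false};

    std::thread reader([&] {
        for (;;) {
            // Check the flag first, so that the read below is guaranteed
            // to see 12 once the writer has published `done`.
            bool finished = done.load(std::memory_order_acquire);
            int v = cell.load(std::memory_order_relaxed);
            if (v == 10) ++obs.saw10;
            else if (v != 0 && v != 12) ++obs.sawOther;
            obs.last = v;
            if (finished) break;
        }
    });

    // The sequence from the question: store, fence, store, fence.
    cell.store(10, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    cell.store(12, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    done.store(true, std::memory_order_release);

    reader.join();
    return obs;
}
```

Calling `run_once()` many times and summing `saw10` shows how often the intermediate value is actually caught on a given machine. The rules only promise that `last` is 12 and that `sawOther` stays at zero. A `saw10` count of zero would be just as legal as a large one.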

BeeOnRope answered Sep 18 '22 14:09
