As per this question's answer, it seems that LOCK CMPXCHG on x86 actually causes a full barrier. Presumably, this is what Unsafe.compareAndSwapInt() generates under the hood as well. I am struggling to see why that is the case: with the MESI protocol, after you have updated the cache line, couldn't the CPU simply invalidate just that cache line on the other cores, rather than draining ALL store/load buffers of the core which performed the CAS? Seems rather wasteful to me...
Your answer, as far as I can see, is in the comments: MESI updates caches, not store/load buffers. But the reference for LOCK CMPXCHG says that locked operations serialize all outstanding load and store operations; this is why it needs to drain the store/load buffers of this CPU (and not those of the others, as detailed here).
So the current CPU has to perform the atomic operation on the most recent value, which could still reside in its own store/load buffers and so be invisible to the cache-coherence protocol; that is why a fence is needed there, to actually drain them.
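To see the consequence of that barrier from the Java side, here is a minimal sketch. It uses `AtomicInteger.compareAndSet` rather than `Unsafe.compareAndSwapInt()`, and the class name `CasPublish` and its fields are made up for illustration. The assumption (not verified here) is that HotSpot on x86 compiles the CAS to `lock cmpxchg`; what the code actually relies on is the documented guarantee that `compareAndSet` has the memory effects of a volatile read and write, so the plain store to `payload` made before the CAS is visible to a thread that observes the updated flag.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CasPublish {
    // Plain (non-volatile) field: its visibility depends entirely on the CAS below.
    static int payload = 0;
    // Hypothetical flag used to publish the payload to the reader thread.
    static final AtomicInteger flag = new AtomicInteger(0);

    // Publishes 42 through a CAS and returns the value the reader thread observed.
    static int runOnce() {
        payload = 0;
        flag.set(0);
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            // get() is a volatile read; spin until the writer's CAS becomes visible.
            while (flag.get() == 0) {
                // busy-wait
            }
            // The plain write to payload happened before the CAS, so it is visible here.
            seen[0] = payload;
        });
        reader.start();

        payload = 42;             // plain store, may sit in the store buffer for a while
        flag.compareAndSet(0, 1); // CAS: acts as a full barrier for this thread

        try {
            reader.join();        // join() also makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reader", e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

If the CAS only invalidated the flag's cache line on other cores without draining the writer's store buffer, the earlier store to `payload` could become visible after the flag, and the reader could see a stale 0.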