
Java memory model - volatile and x86

I am trying to understand the intrinsics of Java volatile and its semantics, and its translation to the underlying architecture and its instructions. If we consider the following blogs and resources:

fences generated for volatile, What gets generated for read/write of volatile and Stack overflow question on fences

here is what I gather:

  • A volatile read inserts LoadLoad/LoadStore barriers after it (the LFENCE instruction on x86)
  • It prevents the reordering of the load with subsequent loads/stores
  • It is supposed to guarantee loading of the global state that was modified by other threads, i.e. after the LFENCE the state modifications done by other threads are visible to the current thread on its CPU.

What I am struggling to understand is this: Java does not emit an LFENCE on x86, i.e. a read of a volatile does not cause an LFENCE... I know that the memory ordering of x86 prevents the reordering of loads with loads/stores, so the second bullet point is taken care of. However, I would assume that in order for the state to be visible to this thread, an LFENCE instruction should be issued to guarantee that all load buffers are drained before the next instruction after the fence is executed (as per the Intel manual). I understand there is a cache coherence protocol on x86, but shouldn't a volatile read still drain any loads in the buffers?
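For concreteness, here is a minimal sketch of the pattern being reasoned about (class and field names are mine, not from the question): with flag declared volatile, a reader that observes flag == true is also guaranteed to observe data == 42, and the puzzle is how the volatile read provides this on x86 without an LFENCE.

// Minimal sketch (illustrative names, not from the question) of the visibility
// pattern under discussion. Run with -ea to enable the assert.
public class VolatileVisibility {
    private int data;              // plain field, published via the volatile flag
    private volatile boolean flag; // volatile read/write carry the barriers listed above

    void writer() {
        data = 42;   // plain store
        flag = true; // volatile store publishes data
    }

    void reader() {
        while (!flag) { }  // volatile read; spins until the writer's store is visible
        assert data == 42; // guaranteed by the happens-before edge on flag
    }

    public static void main(String[] args) throws InterruptedException {
        VolatileVisibility v = new VolatileVisibility();
        Thread r = new Thread(v::reader);
        Thread w = new Thread(v::writer);
        r.start();
        w.start();
        r.join();
        w.join();
        System.out.println("reader finished, data = " + v.data);
    }
}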

asked Apr 27 '17 by Bober02


2 Answers

On x86, the buffers are pinned to the cache line. If the cache line is lost, the value in the buffer isn't used. So there's no need to fence or drain the buffers; the value they contain must be current because another core can't modify the data without first invalidating the cache line.
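If you want to see this for yourself, one way (assuming a JDK with the hsdis disassembler library available) is to dump the JIT-compiled code for a method that does a volatile read; on x86 it should show up as an ordinary load with no LFENCE, consistent with this answer. The class and method names below are made up for the illustration.

// Illustrative probe (names are mine): JIT-compile a volatile read and print the
// generated code to confirm that no LFENCE is emitted on x86.
// Requires the hsdis disassembler on the JVM's library path, e.g.:
//   java -XX:+UnlockDiagnosticVMOptions \
//        -XX:CompileCommand=print,VolatileReadProbe::readFlag VolatileReadProbe
public class VolatileReadProbe {
    private volatile boolean flag;

    boolean readFlag() {
        return flag; // expected to compile to a plain load on x86
    }

    public static void main(String[] args) {
        VolatileReadProbe p = new VolatileReadProbe();
        boolean sink = false;
        for (int i = 0; i < 1_000_000; i++) { // enough iterations to trigger JIT compilation
            sink ^= p.readFlag();
        }
        System.out.println(sink);
    }
}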

answered Nov 16 '22 by David Schwartz


x86 provides TSO (Total Store Order). So, on a hardware level, you get the following barriers for free: [LoadLoad][LoadStore][StoreStore]. The only one missing is the [StoreLoad].

A load has acquire semantics

r1=X
[LoadLoad]
[LoadStore]

A store has release semantics

[LoadStore]
[StoreStore]
Y=r2
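As a side note (my own illustration, not part of this answer), Java 9+ exposes exactly these weaker acquire/release flavours through VarHandle, which makes the mapping to the barrier notation above quite direct:

import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Illustrative sketch (class/field names are mine): VarHandle acquire/release
// access modes correspond to the barrier pairs above, i.e. [LoadLoad][LoadStore]
// after a load and [LoadStore][StoreStore] before a store, but NOT the
// [StoreLoad] that a full volatile access implies.
public class AcquireRelease {
    private int x;
    private static final VarHandle X;

    static {
        try {
            X = MethodHandles.lookup().findVarHandle(AcquireRelease.class, "x", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    int loadAcquire() {
        return (int) X.getAcquire(this); // r1 = X; [LoadLoad][LoadStore]
    }

    void storeRelease(int v) {
        X.setRelease(this, v);           // [LoadStore][StoreStore]; Y = r2
    }
}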

If you do a store followed by a load, you end up with this:

[LoadStore]
[StoreStore]
Y=r2
r1=X
[LoadLoad]
[LoadStore]

The issue is that the store and the load can still be reordered, and hence it isn't sequentially consistent; and sequential consistency is mandatory for the Java memory model. The only way to prevent this is with a [StoreLoad].

[LoadStore]
[StoreStore]
Y=r2
[StoreLoad]
r1=X
[LoadLoad]
[LoadStore]
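To make the problem concrete, here is the classic store-buffering litmus test written in Java (my own illustration; class and field names are made up). With plain fields, both threads can read 0, because each store may still be sitting in its core's store buffer while the following load executes; declaring x and y volatile adds the [StoreLoad] and forbids that outcome.

// Illustrative store-buffering litmus test (names are mine, not from the answer).
// With plain int fields, (r1, r2) == (0, 0) is a possible outcome; with volatile
// fields, the [StoreLoad] rules it out.
public class StoreBuffering {
    static volatile int x, y;  // drop 'volatile' and (0, 0) becomes possible
    static int r1, r2;

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 1_000_000; i++) {
            x = 0; y = 0;
            Thread t1 = new Thread(() -> { x = 1; r1 = y; }); // store then load
            Thread t2 = new Thread(() -> { y = 1; r2 = x; }); // store then load
            t1.start(); t2.start();
            t1.join();  t2.join();
            if (r1 == 0 && r2 == 0) {
                System.out.println("store-load reordering observed at iteration " + i);
                return;
            }
        }
        System.out.println("no reordering observed (expected with volatile)");
    }
}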

The most logical place to add it is to the write, since normally reads are more frequent than writes. So the write would become:

[LoadStore]
[StoreStore]
Y=r2
[StoreLoad]

Because x86 provides TSO, the following fences can be no-ops:

[LoadLoad][LoadStore][StoreStore]

So the only relevant one is the [StoreLoad], and this can be accomplished with an MFENCE or a lock addl $0, (%rsp).

The LFENCE and SFENCE are not relevant for this situation; they are for weakly ordered loads and stores (e.g. those of SSE).

What the [StoreLoad] does on x86 is stop executing loads until the store buffer has been drained. This makes sure that the load becomes globally visible (i.e. reads from memory/cache) AFTER the store has become globally visible (has left the store buffer and entered the L1d cache).
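As a small addition of my own (not part of the answer): if you want that full fence without declaring a variable volatile, Java 9+ exposes it directly as VarHandle.fullFence(), which on x86 should boil down to exactly the MFENCE / lock addl idiom mentioned above.

import java.lang.invoke.VarHandle;

// Illustrative sketch (my own, not from the answer): making the [StoreLoad]
// explicit with a standalone full fence instead of relying on a volatile store.
public class ExplicitStoreLoad {
    static int x, y;  // plain fields on purpose
    static int r1;

    static void storeThenLoad() {
        y = 1;                 // Y = r2  (plain store)
        VarHandle.fullFence(); // full fence: includes the [StoreLoad]
        r1 = x;                // r1 = X  (plain load, now ordered after the store)
    }

    public static void main(String[] args) {
        storeThenLoad();
        System.out.println(r1);
    }
}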

answered Nov 16 '22 by pveentjer