
Java memory model - volatile and x86

I am trying to understand the intrinsics of Java volatile and its semantics, and its translation to the underlying architecture and its instructions. If we consider the following blogs and resources:

fences generated for volatile, What gets generated for read/write of volatile and Stack overflow question on fences

here is what I gather:

  • A volatile read inserts LoadLoad/LoadStore barriers after it (the LFENCE instruction on x86)
  • It prevents the reordering of the load with subsequent loads/stores
  • It is supposed to guarantee loading of the global state that was modified by other threads, i.e. after the LFENCE the state modifications done by other threads are visible to the current thread on its CPU.

What I am struggling to understand is this: Java does not emit an LFENCE on x86, i.e. a read of a volatile does not cause an LFENCE... I know that the memory ordering of x86 prevents the reordering of loads with loads/stores, so the second bullet point is taken care of. However, I would assume that in order for the state to be visible to this thread, an LFENCE instruction should be issued to guarantee that all load buffers are drained before the next instruction after the fence is executed (as per the Intel manual). I understand there is a cache coherence protocol on x86, but shouldn't a volatile read still drain any loads in the buffers?
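For concreteness, here is a minimal sketch of the pattern being reasoned about (class and field names are mine, not from the question): with flag declared volatile, a reader that observes flag == true is also guaranteed to observe data == 42, and the puzzle is how the volatile read provides this on x86 without an LFENCE.

// Minimal sketch (illustrative names, not from the question) of the visibility
// pattern under discussion. Run with -ea to enable the assert.
public class VolatileVisibility {
    private int data;              // plain field, published via the volatile flag
    private volatile boolean flag; // volatile read/write carry the barriers listed above

    void writer() {
        data = 42;   // plain store
        flag = true; // volatile store publishes data
    }

    void reader() {
        while (!flag) { }  // volatile read; spins until the writer's store is visible
        assert data == 42; // guaranteed by the happens-before edge on flag
    }

    public static void main(String[] args) throws InterruptedException {
        VolatileVisibility v = new VolatileVisibility();
        Thread r = new Thread(v::reader);
        Thread w = new Thread(v::writer);
        r.start();
        w.start();
        r.join();
        w.join();
        System.out.println("reader finished, data = " + v.data);
    }
}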

asked Apr 27 '17 by Bober02


2 Answers

On x86, the buffers are pinned to the cache line. If the cache line is lost, the value in the buffer isn't used. So there's no need to fence or drain the buffers; the value they contain must be current because another core can't modify the data without first invalidating the cache line.
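If you want to see this for yourself, one way (assuming a JDK with the hsdis disassembler library available) is to dump the JIT-compiled code for a method that does a volatile read; on x86 it should show up as an ordinary load with no LFENCE, consistent with this answer. The class and method names below are made up for the illustration.

// Illustrative probe (names are mine): JIT-compile a volatile read and print the
// generated code to confirm that no LFENCE is emitted on x86.
// Requires the hsdis disassembler on the JVM's library path, e.g.:
//   java -XX:+UnlockDiagnosticVMOptions \
//        -XX:CompileCommand=print,VolatileReadProbe::readFlag VolatileReadProbe
public class VolatileReadProbe {
    private volatile boolean flag;

    boolean readFlag() {
        return flag; // expected to compile to a plain load on x86
    }

    public static void main(String[] args) {
        VolatileReadProbe p = new VolatileReadProbe();
        boolean sink = false;
        for (int i = 0; i < 1_000_000; i++) { // enough iterations to trigger JIT compilation
            sink ^= p.readFlag();
        }
        System.out.println(sink);
    }
}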

answered Nov 16 '22 by David Schwartz


x86 provides TSO (Total Store Order). So, on a hardware level, you get the following barriers for free: [LoadLoad][LoadStore][StoreStore]. The only one missing is the [StoreLoad].

A load has acquire semantics

r1=X
[LoadLoad]
[LoadStore]

A store has release semantics

[LoadStore]
[StoreStore]
Y=r2
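As a side note (my own illustration, not part of this answer), Java 9+ exposes exactly these weaker acquire/release flavours through VarHandle, which makes the mapping to the barrier notation above quite direct:

import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Illustrative sketch (class/field names are mine): VarHandle acquire/release
// access modes correspond to the barrier pairs above, i.e. [LoadLoad][LoadStore]
// after a load and [LoadStore][StoreStore] before a store, but NOT the
// [StoreLoad] that a full volatile access implies.
public class AcquireRelease {
    private int x;
    private static final VarHandle X;

    static {
        try {
            X = MethodHandles.lookup().findVarHandle(AcquireRelease.class, "x", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    int loadAcquire() {
        return (int) X.getAcquire(this); // r1 = X; [LoadLoad][LoadStore]
    }

    void storeRelease(int v) {
        X.setRelease(this, v);           // [LoadStore][StoreStore]; Y = r2
    }
}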

If you do a store followed by a load, you end up with this:

[LoadStore]
[StoreStore]
Y=r2
r1=X
[LoadLoad]
[LoadStore]

The issue is that the store and the load can still be reordered, and hence it isn't sequentially consistent; and sequential consistency is mandatory for the Java memory model. The only way to prevent this is with a [StoreLoad].

[LoadStore]
[StoreStore]
Y=r2
[StoreLoad]
r1=X
[LoadLoad]
[LoadStore]
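To make the problem concrete, here is the classic store-buffering litmus test written in Java (my own illustration; class and field names are made up). With plain fields, both threads can read 0, because each store may still be sitting in its core's store buffer while the following load executes; declaring x and y volatile adds the [StoreLoad] and forbids that outcome.

// Illustrative store-buffering litmus test (names are mine, not from the answer).
// With plain int fields, (r1, r2) == (0, 0) is a possible outcome; with volatile
// fields, the [StoreLoad] rules it out.
public class StoreBuffering {
    static volatile int x, y;  // drop 'volatile' and (0, 0) becomes possible
    static int r1, r2;

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 1_000_000; i++) {
            x = 0; y = 0;
            Thread t1 = new Thread(() -> { x = 1; r1 = y; }); // store then load
            Thread t2 = new Thread(() -> { y = 1; r2 = x; }); // store then load
            t1.start(); t2.start();
            t1.join();  t2.join();
            if (r1 == 0 && r2 == 0) {
                System.out.println("store-load reordering observed at iteration " + i);
                return;
            }
        }
        System.out.println("no reordering observed (expected with volatile)");
    }
}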

The most logical place to add it is to the write, since normally reads are more frequent than writes. So the write would become:

[LoadStore]
[StoreStore]
Y=r2
[StoreLoad]

Because x86 provides TSO, the following fences can be no-ops:

[LoadLoad][LoadStore][StoreStore]

So the only relevant one is the [StoreLoad], and this can be accomplished with an MFENCE or a lock addl $0, (%rsp).

The LFENCE and SFENCE are not relevant for this situation; they are for weakly ordered loads and stores (e.g. those of SSE).

What the [StoreLoad] does on x86 is stop executing loads until the store buffer has been drained. This makes sure that the load becomes globally visible (i.e. reads from memory/cache) AFTER the store has become globally visible (has left the store buffer and entered the L1d cache).
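As a small addition of my own (not part of the answer): if you want that full fence without declaring a variable volatile, Java 9+ exposes it directly as VarHandle.fullFence(), which on x86 should boil down to exactly the MFENCE / lock addl idiom mentioned above.

import java.lang.invoke.VarHandle;

// Illustrative sketch (my own, not from the answer): making the [StoreLoad]
// explicit with a standalone full fence instead of relying on a volatile store.
public class ExplicitStoreLoad {
    static int x, y;  // plain fields on purpose
    static int r1;

    static void storeThenLoad() {
        y = 1;                 // Y = r2  (plain store)
        VarHandle.fullFence(); // full fence: includes the [StoreLoad]
        r1 = x;                // r1 = X  (plain load, now ordered after the store)
    }

    public static void main(String[] args) {
        storeThenLoad();
        System.out.println(r1);
    }
}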

answered Nov 16 '22 by pveentjer