How does a Java virtual machine implement the "happens-before" memory model?

Java's memory model is based on the "happens-before" relationship: it enforces ordering rules while still leaving the virtual machine's implementation room for optimization, for example around cache invalidation.

For example in the following case:

// Hypothetical enclosing class and lock fields added so the snippets compile.
class Example {
    private final Object lockA = new Object();
    private final Object lockB = new Object();

    // called by thread A
    private void method() {
        // code before lock
        synchronized (lockA) {
            // code inside
        }
    }

    // called by thread B
    private void method2() {
        // code before lock
        synchronized (lockA) {
            // code inside
        }
    }

    // called by thread B
    private void method3() {
        // code before lock
        synchronized (lockB) {
            // code inside
        }
    }
}

If thread A calls method() and thread B then acquires lockA inside method2(), the synchronization on lockA requires that thread B observe all changes thread A made prior to releasing the lock, including changes made in the "code before lock" section.
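For instance, here is a minimal sketch (the class and field names are my own, purely illustrative) of what that guarantee means in practice:

public class HappensBeforeDemo {
    private final Object lockA = new Object();
    private int data;            // deliberately not volatile
    private boolean published;   // only written/read while holding lockA

    void writer() {              // runs on thread A
        data = 42;               // the "code before lock" write
        synchronized (lockA) {
            published = true;
        }                        // releasing lockA publishes all prior writes
    }

    void reader() {              // runs on thread B
        synchronized (lockA) {   // acquiring the same lock...
            if (published) {
                System.out.println(data);  // ...guarantees this prints 42
            }
        }
    }
}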

On the other hand, method3() uses a different lock and therefore does not establish a happens-before relation with the other two methods. This creates an opportunity for optimization.

My question is: how does the virtual machine implement these complex semantics? Does it avoid a full cache flush when one is not needed?

How does it track which variables were changed by which thread at what point, so that it loads from memory only the cache lines that are actually needed?

asked Sep 29 '15 by Petrakeas



2 Answers

You are expecting the JVM to think at too high a level. The memory model intentionally describes only what has to be guaranteed, not how it has to be implemented. Certain architectures have coherent caches that don't need to be flushed at all. Still, actions may be required to forbid the reordering of reads and/or writes beyond a certain point.

But in all cases, these effects are global, since the guarantees are made for all reads and writes, not just those tied to the particular construct that establishes the happens-before relationship. Recall that all writes made before releasing a particular lock happen-before all reads made after acquiring the same lock.

The JVM doesn’t process happens-before relationships at all. It processes code, either by interpreting (executing) it or by generating native code for it. When doing so, it has to obey the memory model by inserting barriers or flushes and by not reordering read or write instructions beyond these barriers. At this point, it usually considers the code in isolation, not looking at what other threads are doing. The effect of these flushes or barriers is always global.

However, having a global effect is not sufficient for establishing a happens-before relationship. This relationship exists only when one thread is guaranteed to have committed all its writes before the other thread is guaranteed to (re-)read the values. No such ordering exists when two threads synchronize on different objects or acquire/release different locks.

In the case of volatile variables, you can evaluate the variable's value to find out whether the other thread has written the expected value and hence committed its writes. In the case of a synchronized block, mutual exclusion enforces an ordering: within the block, a thread can examine all variables guarded by the monitor and see the state produced by a previous update made within a synchronized block using the same monitor.
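A minimal sketch of that volatile-flag pattern (the names are illustrative):

public class VolatileFlag {
    private int payload;              // plain, non-volatile field
    private volatile boolean ready;   // the volatile flag

    void writer() {
        payload = 123;  // plain write...
        ready = true;   // ...committed no later than this volatile write
    }

    void reader() {
        if (ready) {                      // volatile read: if it observes true...
            System.out.println(payload);  // ...this is guaranteed to print 123
        }
    }
}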

Since these effects are global, some developers have been misled into thinking that synchronizing on different locks is fine as long as the assumption about a time ordering seems "reasonable". But such program code must be considered broken, as it relies on side effects of a particular implementation, especially its simplicity.
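For illustration, a sketch of such broken code (hypothetical names):

public class BrokenSync {
    private final Object lockA = new Object();
    private final Object lockB = new Object();
    private int data;

    void writer() {                    // thread 1
        synchronized (lockA) {
            data = 42;
        }
    }

    void reader() {                    // thread 2
        synchronized (lockB) {         // different lock: no happens-before edge
            System.out.println(data);  // may legally print 0, even if it runs "after" writer
        }
    }
}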

One thing recent JVMs do is recognize that synchronizing on purely local objects, i.e. objects never seen by any other thread, cannot establish a happens-before relationship with any other thread. Therefore, the effects of such synchronization can be elided entirely. We can expect more optimizations in the future…
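For example (StringBuffer is just the classic illustration; whether elision actually happens depends on the JVM and its escape analysis):

public class LockElision {
    static int localLocking() {
        StringBuffer buf = new StringBuffer();  // purely local: never seen by another thread
        buf.append("a");     // append() is synchronized, but no other thread can lock buf
        buf.append("b");     // so the JIT may elide this locking entirely
        return buf.length();
    }
}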

answered Oct 24 '22 by Holger

How does it track which variables did change by which thread at what point, so that it only loads from memory just the cache-lines needed?

No. That's not how modern CPUs work.

On every platform you are likely to see multithreaded Java code running on that is complex enough to have this kind of issue, cache coherency is implemented in hardware. A cache line can be transferred directly from one cache to another without going through main memory. In fact, it would be awful if data had to pass through slow main memory every time it was written on one core and read on another. So the caches communicate with each other directly.

When code modifies a memory address, the cache of that core acquires exclusive ownership of the corresponding cache line. If another core wants to read that address, the caches will typically share the line by direct communication. If either core then wants to modify the shared data, it must invalidate the copy in the other core's cache.

So these caches are managed by hardware and effectively make themselves invisible at the software level.

However, CPUs do sometimes prefetch reads or post writes (which have not yet reached the cache). These cases simply require memory barrier instructions. A memory barrier operates entirely inside the CPU, preventing memory operations from being reordered, delayed, or executed early across the barrier. The CPU knows which memory operations are delayed or performed ahead of time, so code doesn't have to keep track of them.
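Since Java 9, such barriers can even be requested explicitly at the Java level via the static fence methods on java.lang.invoke.VarHandle. A rough sketch of where the conceptual release/acquire barriers sit (illustrative only; a correct program would normally make flag volatile or use atomics rather than fencing plain accesses by hand):

import java.lang.invoke.VarHandle;

public class FenceSketch {
    static int data;
    static boolean flag;

    static void writer() {
        data = 42;
        VarHandle.releaseFence();  // earlier writes can't be reordered past this point
        flag = true;
    }

    static void reader() {
        if (flag) {
            VarHandle.acquireFence();  // later reads can't be reordered before this point
            System.out.println(data);
        }
    }
}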

answered Oct 24 '22 by David Schwartz