 

How does the synchronized keyword work internally?

I read the program below, and its answer, in a blog.

int x = 0;
boolean bExit = false;

Thread 1 (not synchronized)

x = 1; 
bExit = true;

Thread 2 (not synchronized)

if (bExit == true) 
System.out.println("x=" + x);

Is it possible for Thread 2 to print "x=0"?
Ans: Yes (reason: "Every thread has their own copy of variables.")

How do you fix it?
Ans: Make both threads synchronize on a common mutex, or make both variables volatile.

My doubt is: if we make the two variables volatile, the two threads will share the variables through main memory. That makes sense, but in the case of synchronization, how is the problem resolved when both threads have their own copy of the variables?

Please help me.

java asked Dec 18 '22 20:12

1 Answer

This is actually more complicated than it seems. There are several arcane things at work.

Caching

Saying "Every thread has their own copy of variables" is not exactly correct. Every thread may have their own copy of variables, and they may or may not flush these variables into the shared memory and/or read them from there, so the whole thing is non-deterministic. Moreover, the very term flushing is really implementation-dependent. There are strict terms such as memory consistency, happens-before order, and synchronization order.

Reordering

This one is even more arcane. This

x = 1; 
bExit = true;

does not even guarantee that Thread 1 will first write 1 to x and then true to bExit. In fact, it does not even guarantee that any of these will happen at all. The compiler may optimize away some values if they are not used later. The compiler and CPU are also allowed to reorder instructions any way they want, provided that the outcome is indistinguishable from what would happen if everything was really in program order. That is, indistinguishable for the current thread! Nobody cares about other threads until...

Synchronization comes in

Synchronization does not only mean exclusive access to resources. It is also not just about preventing threads from interfering with each other. It's also about memory barriers. It can be roughly described as each synchronization block having invisible instructions at the entry and exit, the first one saying "read everything from the shared memory to be as up-to-date as possible" and the last one saying "now flush whatever you've been doing there to the shared memory". I say "roughly" because, again, the whole thing is an implementation detail. Memory barriers also restrict reordering: actions may still be reordered, but the results that appear in the shared memory after exiting the synchronized block must be identical to what would happen if everything was indeed in program order.
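To make this concrete, here is a minimal runnable sketch (class and method names are mine, not from the original post) of the fix the question asks about: both threads synchronize on one common lock object, so the writer's exit from its block happens-before the reader's entry into its block.

```java
// Sketch: fixing the original example with a common lock.
public class SyncVisibility {
    static final Object LOCK = new Object();
    static int x = 0;
    static boolean bExit = false;

    static int runOnce() throws InterruptedException {
        Thread writer = new Thread(() -> {
            synchronized (LOCK) {  // exiting the block publishes both writes
                x = 1;
                bExit = true;
            }
        });
        writer.start();
        while (true) {
            synchronized (LOCK) {  // entering the block reads up-to-date values
                if (bExit) {
                    writer.join();
                    return x;      // guaranteed to be 1 once bExit is seen true
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("x=" + runOnce());  // always prints x=1
    }
}
```

Once the reader observes bExit == true under the same lock, the happens-before chain guarantees it also sees x == 1; "x=0" is no longer possible.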

All that works, of course, only if both blocks use the same locking object.

The whole thing is described in detail in Chapter 17 of the JLS. In particular, what's important is the so-called "happens-before order". If you ever see in the documentation that "this happens-before that", it means that everything the first thread does before "this" will be visible to whoever does "that". This may not even require any locking. Concurrent collections are a good example: one thread puts something there, another one reads it, and that magically guarantees that the second thread will see everything the first thread did before putting that object into the collection, even if those actions had nothing to do with the collection itself!
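The concurrent-collection example above can be sketched like this (the class and field names are mine; the guarantee itself is the documented memory-consistency effect of BlockingQueue): the plain field payload is never locked or volatile, yet the consumer is guaranteed to see the value written before the put.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: visibility piggybacking on a concurrent collection.
public class QueueHandoff {
    static int payload = 0;  // deliberately NOT volatile, never synchronized

    static int receive() throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        Thread producer = new Thread(() -> {
            payload = 42;            // plain write before the hand-off
            try {
                queue.put("ready");  // put() happens-before take() of this element
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        queue.take();                // blocks until the producer's put()
        producer.join();
        return payload;              // guaranteed 42 by the happens-before chain
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("payload=" + receive());
    }
}
```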

Volatile variables

One last warning: you'd better give up on the idea that making variables volatile will solve everything. In this case maybe making bExit volatile will suffice, but using volatile can lead to so many troubles that I'm not even willing to go into them here. One thing is for sure, though: synchronized has a much stronger effect than volatile, and that goes for memory effects too. What's worse, volatile semantics changed in Java 1.5, so there may exist implementations that still use the old semantics, which were even more obscure and confusing, whereas synchronized has always worked well, provided you understand what it is and how to use it.

Pretty much the only reason to use volatile is performance because synchronized may cause lock contention and other troubles. Read Java Concurrency in Practice to figure all that out.

Q & A

1) You wrote "now flush whatever you've been doing there to the shared memory" about synchronized blocks. But will we see only the variables that we access in the synchronized block, or all the changes made by the thread that entered the synchronized block (even to variables not accessed in the synchronized block)?

Short answer: it will "flush" all variables that were updated during the synchronized block or before entering the synchronized block. And again, because flushing is an implementation detail, you don't even know whether it will actually flush something or do something entirely different (or doesn't do anything at all because the implementation and the specific situation already somehow guarantee that it will work).

Variables that weren't accessed inside the synchronized block obviously won't change during the execution of the block. However, if you change some of those variables before entering the synchronized block, for example, then you have a happens-before relationship between those changes and whatever happens in the synchronized block (the first bullet in 17.4.5). If some other thread enters another synchronized block using the same lock object, then it synchronizes-with the first thread's exit from the synchronized block, which means that you have another happens-before relationship here. So in this case the second thread will see the variables that the first thread updated prior to entering the synchronized block.

If the second thread tries to read those variables without synchronizing on the same lock, then it is not guaranteed to see the updates. But then again, it isn't guaranteed to see the updates made inside the synchronized block as well. But this is because of the lack of the memory-read barrier in the second thread, not because the first one didn't "flush" its variables (memory-write barrier).
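A small sketch of the Q1 answer (names are mine): a plain write made before entering the synchronized block is covered by the same happens-before chain, so a reader synchronizing on the same lock sees it too.

```java
// Sketch: a write made BEFORE the synchronized block is also published.
public class BeforeBlockVisibility {
    static final Object LOCK = new Object();
    static int prepared = 0;      // written before the synchronized block
    static boolean done = false;  // written inside it

    static int await() throws InterruptedException {
        Thread writer = new Thread(() -> {
            prepared = 7;             // plain write, outside any lock
            synchronized (LOCK) {
                done = true;
            }
        });
        writer.start();
        while (true) {
            synchronized (LOCK) {     // same lock: synchronizes-with writer's exit
                if (done) {
                    writer.join();
                    return prepared;  // guaranteed 7, not just 'done'
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("prepared=" + await());
    }
}
```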

2) In this chapter of the JLS that you posted, it is written that: "A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field." Doesn't this mean that when a variable is volatile, you will see only changes to that variable (because it says the write happens-before the read, not happens-before every operation between them)? I mean, doesn't this mean that in the example given in the description of the problem, we can see bExit = true but x = 0 in the second thread if only bExit is volatile? I ask because I found this question here: http://java67.blogspot.bg/2012/09/top-10-tricky-java-interview-questions-answers.html and it is written that if bExit is volatile the program is OK. So will the registers flush only bExit's value, or both bExit's and x's values?

By the same reasoning as in Q1, if you do bExit = true after x = 1, then there is an in-thread happens-before relationship because of the program order. Now, since volatile writes happen-before volatile reads, it is guaranteed that the second thread will see whatever the first thread updated prior to writing true to bExit. Note that this behavior is only guaranteed since Java 1.5 or so, so older or buggy implementations may or may not support it. I have seen bits of the standard Oracle implementation that use this feature (the java.util.concurrent collections), so you can at least assume that it works there.
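Here is the same piggybacking as a runnable sketch (class name is mine): only bExit is volatile, yet once the reader observes bExit == true, transitivity of happens-before forces it to see x == 1 as well.

```java
// Sketch: only the flag is volatile; x rides along via happens-before.
public class VolatilePiggyback {
    static int x = 0;                       // plain field
    static volatile boolean bExit = false;  // the only volatile field

    static int read() throws InterruptedException {
        Thread writer = new Thread(() -> {
            x = 1;         // program order: happens-before the volatile write
            bExit = true;  // volatile write
        });
        writer.start();
        while (!bExit) { } // volatile read; spin until the write is visible
        writer.join();
        return x;          // transitivity guarantees 1, never 0
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("x=" + read());
    }
}
```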

3) Why does the monitor matter for memory visibility when using synchronized blocks? I mean, when a thread exits a synchronized block, aren't all variables (those accessed in this block, or all variables of the thread, which relates to the first question) flushed from registers to main memory or broadcast to all CPU caches? Why does the object of synchronization matter? I just cannot imagine what the relations between the object of synchronization and memory are, and how they are made. I know that we should use the same monitor to see these changes, but I don't understand how the memory that should become visible is mapped to objects. Sorry for the long questions, but these are really interesting questions for me, and they are related to the original question.

Ha, this one is really interesting. I don't know. Probably it flushes anyway, but the Java specification is written with high abstraction in mind, so maybe it allows for some really weird hardware where partial flushes or other kinds of memory barriers are possible. Suppose you have a two-CPU machine with 2 cores on each CPU. Each CPU has some local cache for every core and also a common cache. A really smart VM may want to schedule two threads on one CPU and two threads on the other one. Each pair of threads uses its own monitor, and the VM detects that the variables modified by these two threads are not used in any other threads, so it only flushes them as far as the CPU-local cache.

See also this question about the same issue.

4) I thought that everything written before a volatile write would be up to date when we read the volatile (moreover, a volatile read in Java is a memory barrier), but the documentation doesn't say this.

It does:

17.4.5. If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).

If hb(x, y) and hb(y, z), then hb(x, z).

A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.

If x = 1 comes before bExit = true in program order, then we have happens-before between them. If some other thread reads bExit after that, then we have happens-before between write and read. And because of the transitivity, we also have happens-before between x = 1 and read of bExit by the second thread.

5) Also, if we have volatile Person p, do we have some dependency when we use p.age = 20 and print(p.age), or do we have a memory barrier in this case (assume age is not volatile)? I think no.

You are correct. Since age is not volatile, there is no memory barrier, and that's one of the trickiest things. Here is a fragment from CopyOnWriteArrayList, for example:

        Object[] elements = getArray();
        E oldValue = get(elements, index);
        if (oldValue != element) {
            int len = elements.length;
            Object[] newElements = Arrays.copyOf(elements, len);
            newElements[index] = element;
            setArray(newElements);
        } else {
            // Not quite a no-op; ensures volatile write semantics
            setArray(elements);
        }
Here, getArray and setArray are a trivial getter and setter for the array field. But since the code changes elements of the array, it is necessary to write the reference to the array back to where it came from in order for the changes to the array's elements to become visible. Note that this is done even if the element being replaced is the same element that was there in the first place! It is precisely because some fields of that element may have been changed by the calling thread, and it's necessary to propagate these changes to future readers.
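The same trick can be shown in a stripped-down sketch (my own class, not the JDK source): re-writing a volatile reference, even with the same value, performs a volatile write that publishes changes made to plain state before it.

```java
import java.util.Arrays;

// Sketch of the CopyOnWriteArrayList trick: a "redundant" volatile write
// publishes earlier plain writes to future readers of the volatile field.
public class Republish {
    private volatile Object[] array = new Object[1];

    private Object[] getArray() { return array; }     // volatile read
    private void setArray(Object[] a) { array = a; }  // volatile write

    public void set(int index, Object element) {
        Object[] elements = getArray();
        if (elements[index] != element) {
            Object[] newElements = Arrays.copyOf(elements, elements.length);
            newElements[index] = element;
            setArray(newElements);
        } else {
            // Not a no-op: this volatile write publishes any changes the
            // caller made to element's own fields before calling set()
            setArray(elements);
        }
    }

    public Object get(int index) { return getArray()[index]; }
}
```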

6) And is there any happens-before between 2 subsequent reads of a volatile field? I mean, will the second read see all changes from the thread which read this field before it (of course, we will have changes only if volatile influences the visibility of all changes before it, which I am a little confused about)?

No, there is no relationship between volatile reads. Of course, if one thread performs a volatile write and then two other threads perform volatile reads, they are guaranteed to see everything at least as up to date as it was before the volatile write, but there is no guarantee of whether one thread will see more up-to-date values than the other. Moreover, there is not even a strict definition of one volatile read happening before another! It is wrong to think of everything happening on a single global timeline. It is more like parallel universes with independent timelines that sometimes sync their clocks by performing synchronization and exchanging data with memory barriers.

Sergei Tachenov answered Dec 21 '22 10:12