Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java thread safety of list

I have a List, which is to be used either in thread-safe context or in a non thread-safe one. Which one will it be, is impossible to determine in advance.

In such special case, whenever the list enters non-thread-safe context, I wrap it using

Collections.synchronizedList(...)

But I don't want to wrap it, if doesn't enter non-thread-safe context. F.e., because list is huge and used intensively.

I've read about Java, that its optimization policy is strict about multi-threading - if you don't synchronize your code correctly, it is not guaranteed to be executed correctly in inter-thread context - it can reorganize code significantly, providing consistency in context of only one thread (see http://java.sun.com/docs/books/jls/third_edition/html/memory.html#17.3). F.e.,

op1; op2; op3;

may be reorganized in

op3; op2; op1;

, if it produces same result (in one-thread context).

Now I wonder, if I

  1. fill my list before wrapping it by synchronizedList,

  2. then wrap it up,

  3. then use by different thread

, - is there a possibility, that different thread will see this list filled only partially or not filled at all? Might JVM postpone (1) till after (3)? Is there a correct and fast way to make (big) List being non-thread-safe to become thread-safe?

like image 670
Andrey Avatar asked Feb 07 '11 14:02

Andrey


2 Answers

When you give your list to another thread by thread-safe means (for example using a synchronized block, a volatile variable or an AtomicReference), it is guaranteed that the second thread sees the whole list in the state it was when transferring (or any later state, but not an earlier state).

If you don't change it afterwards, you also don't need your synchronizedList.


Edit (after some comments, to backup my claim):

I assume the following:

  • we have a volatile variable list.

    volatile List<String> list = null;
    
  • Thread A:

    1. creates a List L and fills L with elements.
    2. sets list to point to L (this means writes L to list)
    3. does no further modifications on L.

    Sample source:

    public void threadA() {
       List<String> L = new ArrayList<String>();
       L.add("Hello");
       L.add("World");
       list = l;
    }
    
  • Thread B:

    1. reads K from list
    2. iterates over K, printing the elements.

    Sample source:

    public void threadB() {
         List<String> K = list;
         for(String s : K) {
             System.out.println(s);
         }
    }
    
  • All other threads do not touch the list.

Now we have this:

  • The actions 1-A and 2-A in Thread A are ordered by program order so 1 comes before 2.
  • The action 1-B and 2-B in Thread B are ordered by program order so 1 comes before 2.
  • The action 2-A in Thread A and action 1-B in Thread are ordered by synchronization order, so 2-A comes before 1-B, since

    A write to a volatile variable (§8.3.1.4) v synchronizes-with all subsequent reads of v by any thread (where subsequent is defined according to the synchronization order).

  • The happens-before-order is the transitive closure of the program orders of the individual threads and the synchronization order. So we have:

    1-A happens-before 2-A happens-before 1-B happens-before 2-B

    and thus 1-A happens-before 2-B.

  • Finally,

    If one action happens-before another, then the first is visible to and ordered before the second.

So our iterating thread really can see the whole list, and not only some parts of it. So, transmitting the list with a single volatile variable is sufficient, and we don't need synchronization in this simple case.


One more edit (here, since I have more formatting freedom than in the comments) about the program order of Thread A. (I also added some sample Code above.)

From the JLS (section program order):

Among all the inter-thread actions performed by each thread t, the program order of t is a total order that reflects the order in which these actions would be performed according to the intra-thread semantics of t.

So, what are the intra-thread semantics of thread A?

Some paragraphs above:

The memory model determines what values can be read at every point in the program. The actions of each thread in isolation must behave as governed by the semantics of that thread, with the exception that the values seen by each read are determined by the memory model. When we refer to this, we say that the program obeys intra-thread semantics. Intra-thread semantics are the semantics for single threaded programs, and allow the complete prediction of the behavior of a thread based on the values seen by read actions within the thread. To determine if the actions of thread t in an execution are legal, we simply evaluate the implementation of thread t as it would be performed in a single threaded context, as defined in the rest of this specification.

The rest of this specification includes section 14.2 (Blocks):

A block is executed by executing each of the local variable declaration statements and other statements in order from first to last (left to right).

So, the program order is indeed the order in which the statements/expressions are given in the program source code.

Thus, in our example source, the memory actions create a new ArrayList, add "Hello", add "World", and assign to list (the first three consist of more subactions) indeed are in this program order.

(The VM does not have to execute the actions in this order, but this program order still contributes to the happens-before order, and thus to the visibility to other threads.)

like image 62
Paŭlo Ebermann Avatar answered Oct 17 '22 16:10

Paŭlo Ebermann


If you fill your list and then wrap it in the same thread, you'll be safe.

However there are several things to bear in mind:

  1. Collections.synchronizedList() only guarantees you a low-level thread safety. Complex operations, like if ( !list.contains( elem ) ) list.add( elem ); will still need custom synchronization code.
  2. Even this guarantee is void if any thread can obtain a reference to the original list. Make sure this doesn't happen.
  3. Get the functionality right first, then you can start worrying about synchronization being too slow. I very rarely encountered code where the speed of Java synchronization was a serious factor.

Update: I'd like to add a few excerpts from the JLS to hopefully clarify matters a bit.

If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).

This is why filling the list and then wrapping it in the same thread is a safe option. But more importantly:

This is an extremely strong guarantee for programmers. Programmers do not need to reason about reorderings to determine that their code contains data races. Therefore they do not need to reason about reorderings when determining whether their code is correctly synchronized. Once the determination that the code is correctly synchronized is made, the programmer does not need to worry that reorderings will affect his or her code.

The message is clear: make sure that your program, executed in the order which you wrote your code in, doesn't contain data races, and don't worry about reordering.

like image 21
biziclop Avatar answered Oct 17 '22 16:10

biziclop