
Best way to control concurrent access to Java collections

Should I use the old synchronized Vector collection, an ArrayList with synchronized access, Collections.synchronizedList, or some other solution for concurrent access?

I don't see my question in Related Questions nor in my search (Make your collections thread-safe? isn't the same).

Recently, I had to write unit tests of a sort for the GUI parts of our application (basically using the API to create frames, add objects, etc.). Because these operations are called much faster than a user could trigger them, the tests exposed a number of issues with methods trying to access resources not yet created or already deleted.

A particular issue, happening in the EDT, came from walking a linked list of views while altering it in another thread (getting a ConcurrentModificationException, among other problems). Don't ask me why it was a linked list instead of a simple array list (there is even less reason for it since we generally have 0 or 1 view inside...), so I used the more common ArrayList in my question (as it has an older cousin).
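For readers who have not met this exception, here is a minimal sketch of the fail-fast check behind it. It reproduces the failure deterministically in a single thread; the cross-thread case described above trips the same check, only unpredictably. The class and method names (CmeDemo and so on) are made up for illustration and are not from the application in the question.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class CmeDemo {
    /** Returns true if altering the list while walking it raised the exception. */
    static boolean failsWhenModifiedDuringIteration() {
        List<String> views = new LinkedList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String v : views) {
                if (v.equals("a")) {
                    views.remove(v); // structural change made behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // the iterator noticed the change on its next step
        }
    }

    /** Removing through the iterator itself is permitted and leaves the rest intact. */
    static List<String> removeViaIterator() {
        List<String> views = new LinkedList<>(Arrays.asList("a", "b", "c"));
        for (Iterator<String> it = views.iterator(); it.hasNext(); ) {
            if (it.next().equals("a")) {
                it.remove();
            }
        }
        return views;
    }
}
```

The check is best-effort only, so the absence of the exception does not prove that a list is being shared safely.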

Anyway, not being very familiar with concurrency issues, I looked up a bit of info and wondered which to choose: the old (and probably obsolete) Vector (whose operations are synchronized by design), an ArrayList with a synchronized (myList) { } around critical sections (add/remove/walk operations), or a list returned by Collections.synchronizedList (I am not even sure how to use the latter).

I finally chose the second option, because another design mistake was to expose the object (getViewList() method...) instead of providing mechanisms to use it.

But what are the pros and cons of the other approaches?


[EDIT] Lots of good advice here, hard to select one. I will choose the most detailed answer, the one providing links and food for thought... :-) I like Darron's one too.

To summarize:

  • As I suspected, Vector (and probably its evil twin, Hashtable, as well) is largely obsolete. I have seen people say that its old design isn't as good as that of the newer collections, beyond the slowness of synchronization being forced even in a single-threaded environment. If it is kept around, it is mostly because older libraries (and parts of the Java API) still use it.
  • Unlike what I thought, the Collections.synchronizedXxxx wrappers aren't more modern than Vector (they appear to be contemporary with Collections, i.e. Java 1.2!) and are not actually better. Good to know. In short, I should avoid them as well.
  • Manual synchronization seems to be a good solution after all. There might be performance issues, but in my case that isn't critical: operations are done on user actions, the collection is small, and it is not used frequently.
  • The java.util.concurrent package is worth keeping in mind, particularly the CopyOnWrite classes (such as CopyOnWriteArrayList).

I hope I got it right... :-)

PhiLho asked Feb 18 '09 15:02



2 Answers

Vector and the List returned by Collections.synchronizedList() are morally the same thing. I would consider Vector to be effectively (but not actually) deprecated and always prefer a synchronized List instead. The one exception would be old APIs (particularly ones in the JDK) that require a Vector.

Using a naked ArrayList and synchronizing independently gives you the opportunity to tune your synchronization more precisely (either by including additional actions in the mutually exclusive block or by putting together multiple calls to the List in one atomic action). The downside is that it is possible to write code that accesses the naked ArrayList outside synchronization, which is broken.
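As a rough sketch of that approach, the list can be hidden behind a small class so that no caller is able to reach it without holding the lock. ViewRegistry, its method names, and the use of String for the element type are assumptions made for illustration; they are not taken from the asker's code.

```java
import java.util.ArrayList;
import java.util.List;

public class ViewRegistry {
    // Private lock and private list: nothing outside this class can touch either.
    private final Object lock = new Object();
    private final List<String> views = new ArrayList<>();

    /** Check-then-add performed as one atomic action. */
    public boolean addIfAbsent(String view) {
        synchronized (lock) {
            if (views.contains(view)) {
                return false;
            }
            views.add(view);
            return true;
        }
    }

    public boolean remove(String view) {
        synchronized (lock) {
            return views.remove(view);
        }
    }

    /** Returns a copy, so callers can walk it without holding the lock. */
    public List<String> snapshot() {
        synchronized (lock) {
            return new ArrayList<>(views);
        }
    }
}
```

Handing out a copy instead of the live list (as a getViewList()-style accessor would) is what prevents the unsynchronized access mentioned above, at the cost of one allocation per call.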

Another option you might want to consider is a CopyOnWriteArrayList, which will give you thread safety as in Vector and synchronized ArrayList but also iterators that will not throw ConcurrentModificationException as they are working off of a non-live snapshot of the data.
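A small sketch of the snapshot behaviour, under the assumption of a list of strings standing in for the views (SnapshotIterationDemo and its method are hypothetical names): the loop removes an element while iterating, yet the iterator still sees the list as it was when the loop started.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class SnapshotIterationDemo {
    /** Walks the list while removing "c" from it; returns what the iterator saw. */
    static List<String> iterateWhileRemoving(CopyOnWriteArrayList<String> views) {
        List<String> seen = new ArrayList<>();
        for (String v : views) {
            seen.add(v);
            // Each write copies the backing array; the iterator keeps the old one.
            views.remove("c");
        }
        return seen;
    }
}
```

Two caveats: every write copies the whole array, so this fits small or read-mostly lists best, and the iterators do not support remove().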

You might find some of these recent blogs on these topics interesting:

  • Java Concurrency Bugs #3 - atomic + atomic != atomic
  • Java Concurrency Bugs #4: ConcurrentModificationException
  • CopyOnWriteArrayList concurrency fun
Alex Miller answered Sep 21 '22 00:09


I strongly recommend the book "Java Concurrency in Practice".

Each of the choices has advantages/disadvantages:

  1. Vector - considered "obsolete". It may get less attention and bug fixes than more mainstream collections.
  2. Your own synchronization blocks - Very easy to get incorrect. Often gives poorer performance than the choices below.
  3. Collections.synchronizedList() - Choice 2 done by experts. This is still not complete, because of multi-step operations that need to be atomic (get/modify/set or iteration).
  4. New classes from java.util.concurrent - Often have more efficient algorithms than choice 3. Similar caveats about multi-step operations apply but tools to help you are often provided.
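To illustrate the caveat in choice 3, here is a minimal sketch of using Collections.synchronizedList. Each single call is synchronized by the wrapper, but a multi-step operation or an iteration still needs an explicit lock on the returned list, as its Javadoc requires for traversal. The class name and helper methods are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SynchronizedListDemo {
    static List<String> newSharedList() {
        return Collections.synchronizedList(new ArrayList<>());
    }

    /** contains() + add() are each atomic, but the pair is not unless we lock. */
    static boolean addIfAbsent(List<String> syncList, String item) {
        synchronized (syncList) { // same monitor the wrapper uses internally
            if (syncList.contains(item)) {
                return false;
            }
            return syncList.add(item);
        }
    }

    /** Iteration must hold the list's lock for the whole walk. */
    static int totalLength(List<String> syncList) {
        int total = 0;
        synchronized (syncList) {
            for (String s : syncList) {
                total += s.length();
            }
        }
        return total;
    }
}
```

This only works if the argument really is the wrapper returned by Collections.synchronizedList; locking on some other object would not exclude the wrapper's own methods.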
Darron answered Sep 23 '22 00:09