
Java Concurrency Incrementing a Value

I have been reading about volatile and synchronized in Java but have been scratching my head in confusion. I am hoping someone can help me clear up a problem:

private HashMap<String, Integer> map = new HashMap<String, Integer>();

In my thread

if (map.get("value") == null)
{
    map.put("value", 0);
}
map.put("value", map.get("value") + 1);

My goal is to have all the threads share this map. If I add volatile it doesn't seem to fix the problem for me (I print the map and see that the value is being overwritten each time). I then tried using ConcurrentHashMap and adding volatile in front of that... that also didn't seem to work. Based on my reading about volatile, my understanding is that it should "lock" access to the map while the map is being written to, and release the lock once the write is done.

So... then I tried adding static

private static ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();

and that seems to work perfectly... But... I keep reading that using static isn't the right way due to something about 'contention' (which I don't quite understand)

Thanks in advance

asked Dec 28 '22 09:12 by K2xL


1 Answer

Volatile won't help here. Volatile is useful to solve visibility problems, but you are facing another problem: atomicity.

Oh, and volatile has absolutely nothing to do with locking. It won't acquire a lock when reading/writing, and it won't ever release anything. What it does is this: all the actions that happened-before a write to a volatile field will be visible to every other thread after that thread reads the same volatile field. There's no locking involved; the only similarity is that writing/reading a volatile field has exactly the same memory effects as releasing/acquiring a lock.
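To make the visibility guarantee concrete, here is a minimal sketch. The class VolatilePublication and its method names are made up for illustration. The writer stores a plain field and then sets a volatile flag; a reader that observes the flag as true is guaranteed to also see the plain field's value. No lock is taken anywhere.

```java
class VolatilePublication {
    private int payload;             // plain (non-volatile) field
    private volatile boolean ready;  // volatile flag used to publish payload

    // Writer side: the write to payload happens-before the volatile write to ready.
    void publish(int value) {
        payload = value;
        ready = true;
    }

    // Reader side: once the volatile read sees true, payload is visible as well.
    int awaitPayload() {
        while (!ready) {
            Thread.yield(); // spin politely until the flag is set
        }
        return payload;
    }

    // Hypothetical helper: publish from one thread, read from the calling thread.
    static int demo() {
        VolatilePublication publication = new VolatilePublication();
        Thread writer = new Thread(() -> publication.publish(42));
        writer.start();
        int seen = publication.awaitPayload();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 42
    }
}
```

Note that this only covers visibility. Two threads doing a read-modify-write on the same volatile field would still race.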

The get and put calls are two separate operations, so the combination is not atomic: other things may happen between the two.

For instance, one thread will get the value, then ANOTHER thread will get the same value, both will increment the value, then the first will put the new value, then the second will do the same. One of the two increments is lost, so the final result is not what you expected.
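The interleaving above can be replayed deterministically. This is only a single-threaded walk-through with made-up names (LostUpdateWalkthrough, readByA, readByB), not a real race, but it shows why two increments can leave the value at 1 instead of 2.

```java
class LostUpdateWalkthrough {
    // Replays the problematic ordering step by step, using local variables
    // to stand in for what each thread holds after its get().
    static int replay() {
        int shared = 0;          // the value stored in the map

        int readByA = shared;    // thread A: get -> 0
        int readByB = shared;    // thread B: get -> 0 (A has not written yet)

        shared = readByA + 1;    // thread A: put 1
        shared = readByB + 1;    // thread B: put 1, overwriting A's update

        return shared;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints 1, although two increments ran
    }
}
```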

The most common solution to this problem is to serialize access (ie, synchronize) to the shared variable, or to use compare-and-set (CAS) (so you won't need to do synchronization).

1. synchronized

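private final Map<String, Integer> m = new ConcurrentHashMap<String, Integer>();
// The whole get-then-put runs while holding this object's lock.
public synchronized void incrementValue(final String valueName) {
  final Integer current = m.get(valueName);
  m.put(valueName, current == null ? 1 : current + 1); // a missing key counts as 0
}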

Note that if you use this solution then EVERY ACCESS to the map must synchronize on the same lock.
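A slightly fuller sketch of the synchronized approach, assuming a hypothetical class SynchronizedCounters. Both the increment and the read are synchronized on the same object, which is the point of the note above. Because every access is guarded, a plain HashMap is enough here.

```java
import java.util.HashMap;
import java.util.Map;

class SynchronizedCounters {
    // Guarded by "this": never touched outside a synchronized method.
    private final Map<String, Integer> counts = new HashMap<>();

    synchronized void increment(String name) {
        Integer current = counts.get(name);
        counts.put(name, current == null ? 1 : current + 1);
    }

    // Reads take the same lock as writes.
    synchronized int get(String name) {
        Integer current = counts.get(name);
        return current == null ? 0 : current;
    }

    // Hypothetical helper: hammer one key from several threads and return the total.
    static int demo(int threads, int incrementsPerThread) {
        SynchronizedCounters counters = new SynchronizedCounters();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counters.increment("value");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counters.get("value");
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000)); // prints 4000
    }
}
```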

2. CAS

Many CAS algorithms are already implemented in the JVM in a very efficient way (i.e., they use native code, and the JIT may use processor-specific instructions that you cannot access in other ways; check the class Unsafe in Sun's JVM for example).

One class that might be useful to you here is AtomicInteger. You can use it like this:

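private final Map<String, AtomicInteger> m = new ConcurrentHashMap<String, AtomicInteger>();
public void incrementValue(final String valueName) {
  // computeIfAbsent (Java 8+) creates the counter atomically the first time a key is seen.
  m.computeIfAbsent(valueName, k -> new AtomicInteger()).incrementAndGet();
}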

What a CAS algorithm will do is something like this:

for (;;) {
  state = object.getCurrentState();
  if (object.updateValueAndStateIfStateDidntChange(state)) {
    break;
  }
}

It is assumed that the method updateValueAndStateIfStateDidntChange is atomic and will return true only if it was able to update the value. This way, if another thread modifies the value after you get the state and before you update the value, the method will return false and the loop will try it again.
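With AtomicInteger, the loop above could look roughly like this. The class CasIncrement and its helper methods are made up for illustration; compareAndSet plays the role of updateValueAndStateIfStateDidntChange. In practice incrementAndGet already does this for you; the explicit loop is only here to show the mechanism.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasIncrement {
    // Retry until our compareAndSet wins: read, compute, then publish only
    // if nobody changed the value in between.
    static int increment(AtomicInteger counter) {
        for (;;) {
            int current = counter.get();
            int next = current + 1;
            if (counter.compareAndSet(current, next)) {
                return next;
            }
            // Another thread got in first; loop and try again with the fresh value.
        }
    }

    // Hypothetical helper: increment one counter from several threads.
    static int demo(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    increment(counter);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(demo(4, 1000)); // prints 4000
    }
}
```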

Assuming you can implement that method in a way that doesn't use synchronized (and you can, through the classes in java.util.concurrent), you will avoid contention (which means threads waiting to obtain locks held by other threads) and you may see a general improvement in performance.

I use this kind of thing a lot in a distributed task execution system I wrote. The tasks must all be executed exactly once, and I have lots of machines executing tasks. The tasks are all specified in a single MySQL table. How to do it? You must have a column whose purpose is to allow the implementation of CAS. Call it executing. Before starting a task, you must do something like: retrieve the next task, run "update tasks set executing = 1 where id = :id AND executing = 0" and count the number of updated rows. If you updated 0 rows, it is because another thread/process/machine has already taken that task (and successfully executed that "update" query); in this case, you forget it and try the next task, because you know that this one is already being executed. If you updated 1 row, then you are good to go and can execute it.
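In JDBC terms the claim step might look like the sketch below. The class TaskClaimer and the method tryClaim are made-up names, and the code assumes you already have an open java.sql.Connection; only the table and column names come from the description above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class TaskClaimer {
    // Returns true only if this caller flipped executing from 0 to 1.
    // The WHERE clause is the "compare", the SET is the "set", and the
    // database applies both as one atomic statement.
    static boolean tryClaim(Connection connection, long taskId) throws SQLException {
        String sql = "UPDATE tasks SET executing = 1 WHERE id = ? AND executing = 0";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setLong(1, taskId);
            int updatedRows = statement.executeUpdate();
            return updatedRows == 1; // 0 means someone else already claimed it
        }
    }
}
```

A worker would call tryClaim for each candidate task and simply move on to the next one whenever it returns false.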

Another place where I use this idea of CAS a lot is in a very dynamic (with respect to its configuration) resource pool I wrote (I use it mostly to manage "connections", i.e., sockets, but it is sufficiently generic to hold any kind of resource). Basically, it counts how many resources it is holding. When you try to acquire a resource, it reads the counter, decrements it, and tries to update it (which succeeds only if nothing else modified the counter in between). If the update is successful, you can simply take a resource from the pool and lend it (once the counter reaches 0, it won't lend a resource). If I ever publish this code, I will be certain to add a link to it here.
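The counter part of such a pool can be sketched as follows. This is not the actual pool code, just an illustration of the idea with made-up names (PoolCounter, tryAcquire, release, remaining); the resources themselves are left out.

```java
import java.util.concurrent.atomic.AtomicInteger;

class PoolCounter {
    private final AtomicInteger available;

    PoolCounter(int initial) {
        available = new AtomicInteger(initial);
    }

    // Read the counter, refuse at 0, otherwise try to publish the decremented
    // value; retry if another thread changed the counter in between.
    boolean tryAcquire() {
        for (;;) {
            int current = available.get();
            if (current == 0) {
                return false; // nothing left to lend
            }
            if (available.compareAndSet(current, current - 1)) {
                return true;  // caller may now take a resource from the pool
            }
        }
    }

    // Giving a resource back just bumps the counter.
    void release() {
        available.incrementAndGet();
    }

    int remaining() {
        return available.get();
    }

    public static void main(String[] args) {
        PoolCounter pool = new PoolCounter(2);
        System.out.println(pool.tryAcquire()); // prints true
        System.out.println(pool.tryAcquire()); // prints true
        System.out.println(pool.tryAcquire()); // prints false
    }
}
```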

answered Jan 11 '23 01:01 by Bruno Reis