Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to guarantee get() of ConcurrentHashMap to always return the latest actual value?

Introduction
Suppose I have a ConcurrentHashMap singleton:

public class RecordsMapSingleton {

    private static final ConcurrentHashMap<String,Record> payments = new ConcurrentHashMap<>();

    public static ConcurrentHashMap<String, Record> getInstance() {
        return payments;
    }

}

Then I have three subsequent requests (all processed by different threads) from different sources.
The first service makes a request, that gets the singleton, creates Record instance, generates unique ID and places it into Map, then sends this ID to another service.
Then the second service makes another request, with that ID. It gets the singleton, finds Record instance and modifies it.
Finally (probably after half an hour) the second service makes another request, in order to modify Record further.

Problem
In some really rare cases, I'm experiencing heisenbug. In logs I can see, that first request successfully placed Record into Map, second request found it by ID and modified it, and then third request tried to find Record by ID, but found nothing (get() returned null).
The single thing that I found about ConcurrentHashMap guarantees, is:

Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.

from here. If I got it right, it literally means, that get() could return any value that actually was sometime into Map, as far as it doesn't ruin happens-before relationship between actions in different threads.
In my case it applies like this: if third request doesn't care about what happened during processing of first and second, then it could read null from Map.

It doesn't suit me, because I really need to get from Map the latest actual Record.

What have I tried
So I started to think, how to form happens-before relationship between subsequent Map modifications; and came with idea. JLS says (in 17.4.4) that:

A write to a volatile variable v (§8.3.1.4) synchronizes-with all subsequent reads of v by any thread (where "subsequent" is defined according to the synchronization order).

So, let's suppose, I'll modify my singleton like this:

public class RecordsMapSingleton {

    private static final ConcurrentHashMap<String,Record> payments = new ConcurrentHashMap<>();
    private static volatile long revision = 0;

    public static ConcurrentHashMap<String, Record> getInstance() {
        return payments;
    }

    public static void incrementRevision() {
        revision++;
    }
    public static long getRevision() {
        return revision;
    }

}

Then, after each modification of Map or Record inside, I'll call incrementRevision() and before any read from Map I'll call getRevision().

Question
Due to nature of heisenbugs no amount of tests is enough to tell that this solution is correct. And I'm not an expert in concurrency, so couldn't verify it formally.

Can someone approve, that following this approach guarantees that I'm always going to get the latest actual value from ConcurrentHashMap? If this approach is incorrect or appears to be inefficient, could you recommend me something else?

like image 852
mkrakhin Avatar asked Apr 21 '15 12:04

mkrakhin


People also ask

How do I get value from ConcurrentHashMap?

The get() method of ConcurrentHashMap class returns the value to which the specified key is mapped, or null if this map contains no mapping for the key.

Do we have lock while getting value from ConcurrentHashMap?

ConcurrentHashMap: It allows concurrent access to the map. Part of the map called Segment (internal data structure) is only getting locked while adding or updating the map. So ConcurrentHashMap allows concurrent threads to read the value without locking at all.

How does ConcurrentHashMap ensure thread-safety?

ConcurrentHashMap class achieves thread-safety by dividing the map into segments, the lock is required not for the entire object but for one segment, i.e one thread requires a lock of one segment. In ConcurrentHashap the read operation doesn't require any lock.

Can two threads update the ConcurrentHashMap simultaneously?

There can't be two threads changing things at the same time. The whole point of using such data structures is that they prevent more than one thread updating that "core internal data" at the same time. Having two threads that change the map at the very same point time is not possible.


1 Answers

You approach will not work as you are actually repeating the same mistake again. As ConcurrentHashMap.put and ConcurrentHashMap.get will create a happens before relationship but no time ordering guaranty, exactly the same applies to your reads and writes to the volatile variable. They form a happens before relationship but no time ordering guaranty, if one thread happens to call get before the other performed put, the same applies to the volatile read that will happen before the volatile write then. Besides that, you are adding another error as applying the ++ operator to a volatile variable is not atomic.

The guarantees made for volatile variables are not stronger than these made for a ConcurrentHashMap. It’s documentation explicitly states:

Retrievals reflect the results of the most recently completed update operations holding upon their onset.

The JLS states that external actions are inter-thread actions regarding the program order:

An inter-thread action is an action performed by one thread that can be detected or directly influenced by another thread. There are several kinds of inter-thread action that a program may perform:

  • External Actions. An external action is an action that may be observable outside of an execution, and has a result based on an environment external to the execution.

Simply said, if one thread puts into a ConcurrentHashMap and sends a message to an external entity and a second thread gets from the same ConcurrentHashMap after receiving a message from an external entity depending on the previously sent message, there can’t be a memory visibility issue.

It might be the case that these action aren’t programmed that way or that the external entity doesn’t have the assumed dependency, but it might be the case that the error lies in a completely different area but we can’t tell as you didn’t post the relevant code, e.g. the key doesn’t match or the printing code is wrong. But whatever it is, it won’t be fixed by the volatile variable.

like image 122
Holger Avatar answered Oct 11 '22 02:10

Holger