Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Updating and swapping HashMaps with volatile

Tags:

java

volatile

Background

I have a large data map (HashMap), kept in memory, which is updated incrementally (based on incoming messages), by the background thread:

<KEY> => <VALUE>
...

End users will then query it via the REST API:

GET /lookup?key=<KEY>

Updates are not applied immediately, but in batches, once a special control message is received, i.e.

MESSAGE: "Add A" 

A=<VALUE>   //Not visible yet

MESSAGE: "Add B"

B=<VALUE>   //Not visible yet

MESSAGE: "Commit"

//Updates are now visible to the end-users
A=<VALUE>
B=<VALUE

The architecture I devised, is as follows:

volatile Map passiveCopy = new HashMap();
volatile Map activeCopy = new HashMap();

Map<String,Object> pendingUpdates; 

//Interactive requests (REST API)
Object lookup(String key) {
     activeCopy.get(key);
}

//Background thread processing the incoming messages.
//Messages are processed strictly sequentially
//i.e. no other message will be processed, until
//current handleMessage() invocation is completed
//(that is guaranteed by the message processing framework itself)
void handleMessage(Message msg) {

   //New updates go to the pending updates temporary map
   if(msg.type() == ADD) {
      pendingUpdates.put(msg.getKey(),msg.getValue()); 
   }


   if(msg.type() == COMMIT) {     
      //Apply updates to the passive copy of the map
      passiveCopy.addAll(pendingUpdates);

      //Swap active and passive map copies
      Map old = activeCopy; 
      activeCopy = passiveCopy;
      passiveCopy = old;

      //Grace period, wait for on-the-air requests to complete
      //REST API has a hard timeout of 100ms, so no client
      //will wait for the response longer than that 
      Thread.sleep(1000);

      //Re-apply updates to the now-passive (ex-active) copy of the map
      passiveCopy.addAll(pendingUpdates);

      //Reset the pendingUpdates map
      pendingUpdates.clear();
   }

}

The question

Taking that write->read to the volatile field makes a happens-before edge:

A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.

https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.5

and the grace period is chosen correctly, I expect that any updates applied to the passiveCopy (via putAll()), will become visible to the end-user requests (all at once), after the swap.

It is really a case, or there are any corner-case which will make this approach fail ?

NOTE

I know that creating a copy of the Map (so that a new Map instance is assigned to activeCopy an each time), would be safe to do, but I don't want to do this (as it is really large).

like image 666
bitbit9 Avatar asked Nov 06 '22 16:11

bitbit9


1 Answers

Apart from your inconsistent use of activeMap and activeCopy (just remove activeCopy and only swap between activeMap and passiveCopy), your approach is sensible.

This answer quotes the JLS:

If x and y are actions of the same thread and x comes before y in program order, then hb(x,y) [x "happens before" y].

An example is also given in this answer.

From that I take that accesses to a volatile variable/field are basically sequence points; in your case, because the swap comes after the modification of the map in the program code, it should be guaranteed that the modification of the map is completed before the access to the volatile field is actually performed. So no race condition here.

However, in most cases you should use synchronized or explicit locks to synchronize concurrent executions. The only reason to code around using these is if you need high performance, i.e. massive parallelism, where it's either not acceptable for threads to block a lock, or the desired parallelism is so high that threads begin to starve.

That said, I believe you should really just 'invest' in proper mutual exclusion, preferredly using a ReadWriteLock. Because synchronized (which is used by ReadWriteLock internally) implies a memory barrier, you don't need volatile anymore.

For example:

final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
final Lock readLock = rwLock.getReadLock();
final Lock writeLock = rwLock.getWriteLock();

Map passiveCopy = new HashMap();
Map activeMap = new HashMap();

final Map<String,Object> pendingUpdates = new HashMap(); 

//Interactive requests (REST API)
Object lookup(String key) {

  readLock.lock();
  try {
     return activeMap.get(key);
  } finally {
    readLock.unlock();
  }
}

//Background thread processing the incoming messages.
//Messages are processed strictly sequentially
//i.e. no other message will be processed, until
//current handleMessage() invocation is completed
//(that is guaranteed by the message processing framework itself)
void handleMessage(Message msg) {

   //New updates go to the pending updates temporary map
   if(msg.type() == ADD) {
      pendingUpdates.put(msg.getKey(),msg.getValue()); 
   }


   if(msg.type() == COMMIT) {     
      //Apply updates to the passive copy of the map
      passiveCopy.addAll(pendingUpdates);

      final Map tempMap = passiveCopy;    

      writeLock.lock();

      try {
        passiveCopy = activeMap;
        activeMap = tempMap;
      } finally {
        writeLock.unlock();
      }

      // Update the now-passive copy to the same state as the active map:
      passiveCopy.addAll(pendingUpdates);
      pendingUpdates.clear();
   }

}

From your code, however, I read that 'readers' should see a consistent version of the map during their 'lifetime', which is not guaranteed by the above code, i.e. if a single 'reader' accesses the map twice he may see two different maps. This could be solved by having each reader acquire the read lock itself before the first access to the map, releasing it after the last access to the map. This may or may not work in your case because if the readers hold the lock for extended periods, or there are many reader threads, it may block/starve the writer thread trying to commit the update.

like image 127
JimmyB Avatar answered Nov 14 '22 21:11

JimmyB