I have a large data map (HashMap), kept in memory, which is updated incrementally (based on incoming messages), by the background thread:
<KEY> => <VALUE>
...
End users will then query it via the REST API:
GET /lookup?key=<KEY>
Updates are not applied immediately, but in batches, once a special control message is received, i.e.
MESSAGE: "Add A"
A=<VALUE> //Not visible yet
MESSAGE: "Add B"
B=<VALUE> //Not visible yet
MESSAGE: "Commit"
//Updates are now visible to the end-users
A=<VALUE>
B=<VALUE
The architecture I devised, is as follows:
volatile Map passiveCopy = new HashMap();
volatile Map activeCopy = new HashMap();
Map<String,Object> pendingUpdates;
//Interactive requests (REST API)
Object lookup(String key) {
activeCopy.get(key);
}
//Background thread processing the incoming messages.
//Messages are processed strictly sequentially
//i.e. no other message will be processed, until
//current handleMessage() invocation is completed
//(that is guaranteed by the message processing framework itself)
void handleMessage(Message msg) {
//New updates go to the pending updates temporary map
if(msg.type() == ADD) {
pendingUpdates.put(msg.getKey(),msg.getValue());
}
if(msg.type() == COMMIT) {
//Apply updates to the passive copy of the map
passiveCopy.addAll(pendingUpdates);
//Swap active and passive map copies
Map old = activeCopy;
activeCopy = passiveCopy;
passiveCopy = old;
//Grace period, wait for on-the-air requests to complete
//REST API has a hard timeout of 100ms, so no client
//will wait for the response longer than that
Thread.sleep(1000);
//Re-apply updates to the now-passive (ex-active) copy of the map
passiveCopy.addAll(pendingUpdates);
//Reset the pendingUpdates map
pendingUpdates.clear();
}
}
Taking that write->read to the volatile field makes a happens-before edge:
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.5
and the grace period is chosen correctly, I expect that any updates applied to the passiveCopy (via putAll()), will become visible to the end-user requests (all at once), after the swap.
It is really a case, or there are any corner-case which will make this approach fail ?
NOTE
I know that creating a copy of the Map (so that a new Map instance is assigned to activeCopy an each time), would be safe to do, but I don't want to do this (as it is really large).
Apart from your inconsistent use of activeMap
and activeCopy
(just remove activeCopy
and only swap between activeMap
and passiveCopy
), your approach is sensible.
This answer quotes the JLS:
If x and y are actions of the same thread and x comes before y in program order, then hb(x,y) [x "happens before" y].
An example is also given in this answer.
From that I take that accesses to a volatile variable/field are basically sequence points; in your case, because the swap comes after the modification of the map in the program code, it should be guaranteed that the modification of the map is completed before the access to the volatile field is actually performed. So no race condition here.
However, in most cases you should use synchronized
or explicit locks to synchronize concurrent executions. The only reason to code around using these is if you need high performance, i.e. massive parallelism, where it's either not acceptable for threads to block a lock, or the desired parallelism is so high that threads begin to starve.
That said, I believe you should really just 'invest' in proper mutual exclusion, preferredly using a ReadWriteLock
. Because synchronized
(which is used by ReadWriteLock
internally) implies a memory barrier, you don't need volatile
anymore.
For example:
final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
final Lock readLock = rwLock.getReadLock();
final Lock writeLock = rwLock.getWriteLock();
Map passiveCopy = new HashMap();
Map activeMap = new HashMap();
final Map<String,Object> pendingUpdates = new HashMap();
//Interactive requests (REST API)
Object lookup(String key) {
readLock.lock();
try {
return activeMap.get(key);
} finally {
readLock.unlock();
}
}
//Background thread processing the incoming messages.
//Messages are processed strictly sequentially
//i.e. no other message will be processed, until
//current handleMessage() invocation is completed
//(that is guaranteed by the message processing framework itself)
void handleMessage(Message msg) {
//New updates go to the pending updates temporary map
if(msg.type() == ADD) {
pendingUpdates.put(msg.getKey(),msg.getValue());
}
if(msg.type() == COMMIT) {
//Apply updates to the passive copy of the map
passiveCopy.addAll(pendingUpdates);
final Map tempMap = passiveCopy;
writeLock.lock();
try {
passiveCopy = activeMap;
activeMap = tempMap;
} finally {
writeLock.unlock();
}
// Update the now-passive copy to the same state as the active map:
passiveCopy.addAll(pendingUpdates);
pendingUpdates.clear();
}
}
From your code, however, I read that 'readers' should see a consistent version of the map during their 'lifetime', which is not guaranteed by the above code, i.e. if a single 'reader' accesses the map twice he may see two different maps. This could be solved by having each reader acquire the read lock itself before the first access to the map, releasing it after the last access to the map. This may or may not work in your case because if the readers hold the lock for extended periods, or there are many reader threads, it may block/starve the writer thread trying to commit the update.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With