
HashMap vs ConcurrentHashMap vs LoadingCache(Guava)

To cache some data locally in a Spring Boot application, which technique is better in terms of read/write performance: HashMap, ConcurrentHashMap, or LoadingCache (Guava library)? I tried write and read operations on each of these. HashMap was fastest and LoadingCache was slowest. So why should we use LoadingCache, and for what purpose?

Edit: The application is multithreaded. Features such as a maximum cache size and an expiry time can be compromised on. The main motive is to increase read speed.

Spring boot progammer asked Oct 24 '25 02:10


1 Answer

With respect to performance, it depends on the size of your data and the ratio of modifications to reads. Here is a suggestion:

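Static data: If your data is static, initialize a read-only map in the constructor, like so:

import java.util.Map;
final class MyClass<K, V> {
  // Map.copyOf returns an unmodifiable copy; null keys and values are rejected.
  private final Map<K, V> map;
  MyClass(Map<K, V> inputMap) {
    map = Map.copyOf(inputMap);
  }
  V get(K key) {
    return map.get(key);
  }
}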

Rare modifications: If you have rare modifications and the data is not too big:

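import java.util.HashMap;
import java.util.Map;
final class MyCache {
  private volatile Map<Object, Object> map = Map.of();
  // Copy-on-write: copy the current map, apply the change, publish a new immutable map.
  synchronized void put(Object key, Object value) {
    Map<Object, Object> mutable = new HashMap<>(map);
    mutable.put(key, value);
    map = Map.copyOf(mutable);
  }
  Object get(Object key) {
    return map.get(key);
  }
}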

Map.copyOf is available since Java 10 (Map.of since Java 9). In the OpenJDK implementation it creates an immutable hash table that uses an open addressing scheme, unlike HashMap, so lookups can be even faster than with a HashMap. You can also use a plain HashMap with the scheme above in a multithreaded environment, as long as it is never modified after it has been created and published.

synchronized is needed to make sure that you do not lose an update if multiple threads call put at the same time. volatile is needed to make sure the update becomes visible to other threads.
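One way to check the lost-update claim is a small multithreaded test. The following is only a sketch: CopyOnWriteCache and LostUpdateCheck are hypothetical names, and the String/Integer types are chosen just for the example. Several writer threads put distinct keys, and afterwards the map should contain every one of them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Copy-on-write map as described above (hypothetical class name). */
final class CopyOnWriteCache {
    // volatile: readers always see the most recently published snapshot.
    private volatile Map<String, Integer> map = Map.of();

    // synchronized: only one writer at a time copies and republishes the map.
    synchronized void put(String key, Integer value) {
        Map<String, Integer> mutable = new HashMap<>(map);
        mutable.put(key, value);
        map = Map.copyOf(mutable);
    }

    Integer get(String key) {
        return map.get(key);
    }

    int size() {
        return map.size();
    }
}

final class LostUpdateCheck {
    /** Starts the given number of writers, each putting perThread distinct keys; returns the final size. */
    static int run(int threads, int perThread) throws InterruptedException {
        CopyOnWriteCache cache = new CopyOnWriteCache();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    cache.put(id + "-" + i, i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            worker.join(); // join makes the writers' updates visible to this thread
        }
        return cache.size();
    }
}
```

With synchronized on put, LostUpdateCheck.run(4, 50) should return 200. If you remove synchronized, two writers can copy the same snapshot and one of the updates may be overwritten, so the result can come out smaller.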

You wrote that the "main motive is to increase read speed". The solutions above would give the best read speed, but they compromise on update speed.

Lots of data and/or lots of modifications: Use the ConcurrentHashMap.

Even if the approaches above have a slight performance benefit, I recommend using ConcurrentHashMap, because:

  • It is less error prone and ConcurrentHashMap is proven to work. Will you write unit tests with multiple threads proving that your code works correctly?
  • Less code. Fewer bugs.
  • Less code. Less confusion for your fellow developers.
  • The usage pattern might change over time and your home-grown "performance improvement" will turn into a "performance problem".
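For comparison, here is roughly what the same put/get interface looks like on top of ConcurrentHashMap. This is a minimal sketch and ConcurrentMapCache is a hypothetical name. There is no volatile field, no synchronized block and no copying to get right.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Thin wrapper around ConcurrentHashMap (hypothetical class name). */
final class ConcurrentMapCache<K, V> {
    // ConcurrentHashMap handles concurrent readers and writers itself.
    // Note that it rejects null keys and null values.
    private final ConcurrentMap<K, V> entries = new ConcurrentHashMap<>();

    void put(K key, V value) {
        entries.put(key, value);
    }

    V get(K key) {
        return entries.get(key);
    }

    int size() {
        return entries.size();
    }
}
```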

Footnotes:

Use of a Cache

Cache and LoadingCache: The Guava LoadingCache is meant to be used with a CacheLoader. A cache loader is useful to make the cache populate itself automatically and/or refresh its entries. The Guava cache is outdated by now; I recommend looking at Caffeine or cache2k when looking for a caching solution that works in the Java heap.
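To illustrate only the "populate automatically" idea, here is a sketch using just the JDK. It is not Guava's LoadingCache or CacheLoader API, and SimpleLoadingCache is a hypothetical name. It has no size limit, expiry or refresh, which are exactly the features a real cache library adds on top.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical stand-in for a loading cache; this is NOT Guava's API. */
final class SimpleLoadingCache<K, V> {
    private final ConcurrentMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<? super K, ? extends V> loader;

    SimpleLoadingCache(Function<? super K, ? extends V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, invoking the loader first if the key is absent. */
    V get(K key) {
        // computeIfAbsent applies the loader at most once per absent key.
        return entries.computeIfAbsent(key, loader);
    }

    /** Drops the entry so that the next get() loads it again. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

The caller never calls put. It only asks for a key, and the loader (for example a database lookup) runs on a miss. The loader should be short and must not touch the same cache while it runs.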

A cache always has additional overhead in the read path because it needs to do some bookkeeping to know which entries are currently being accessed, for example to decide what to evict. That overhead is minimal in cache2k, at least according to my (disclaimer...) benchmarks.
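A simple way to see that bookkeeping is a size-limited LRU map built on the JDK's LinkedHashMap in access order. This is only an illustration with a hypothetical LruCache class, not how Guava, Caffeine or cache2k are implemented. The point is that every get reorders the entries, so even a read is a modification and needs a lock here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Size-limited LRU cache (hypothetical class, for illustration only). */
final class LruCache<K, V> {
    private final int maxEntries;
    private final LinkedHashMap<K, V> entries;

    LruCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // accessOrder = true: each get() moves the entry to the end of the order.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once the limit is exceeded.
                return size() > LruCache.this.maxEntries;
            }
        };
    }

    // Reads are synchronized too, because in access order a get() changes the map.
    synchronized V get(K key) {
        return entries.get(key);
    }

    synchronized void put(K key, V value) {
        entries.put(key, value);
    }

    // containsKey does not count as an access and leaves the order unchanged.
    synchronized boolean containsKey(K key) {
        return entries.containsKey(key);
    }
}
```

With a limit of two entries, putting "a" and "b", reading "a" and then putting "c" evicts "b", because "a" was used more recently. A plain HashMap or ConcurrentHashMap read does none of this work, which is one reason it measures faster.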

Spring Boot

When a cache is used through the Spring cache abstraction, e.g. with @Cacheable, there will not be a big performance difference between the implementations, since the cache abstraction itself adds significant overhead as well.

The simple cache implementation in Spring that is based on ConcurrentHashMap is only meant for testing and prototyping. I recommend switching to a real cache implementation as soon as possible and setting reasonable resource limits.

Profile and optimize the whole application

Every optimization you make has trade-offs, so you should always look at the whole application and compare "optimizations" to the simplest or most common solution possible.

cruftex answered Oct 25 '25 17:10