ThreadLocal HashMap vs ConcurrentHashMap for thread-safe unbound caches

I'm creating a memoization cache with the following characteristics:

  • a cache miss will result in computing and storing an entry
    • this computation is very expensive
    • this computation is idempotent
  • unbounded (entries never removed) since:
    • the inputs would result in at most 500 entries
    • each stored entry is very small
    • cache is relatively short-lived (typically less than an hour)
    • overall, memory usage isn't an issue
  • there will be thousands of reads - over the cache's lifetime, I expect 99.9%+ cache hits
  • must be thread-safe

What would have superior performance, or under what conditions would one solution be favored over the other?

ThreadLocal HashMap:

class MyCache {
    private static class LocalMyCache {
        final Map<K,V> map = new HashMap<K,V>();

        V get(K key) {
            V val = map.get(key);
            if (val == null) {
                val = computeVal(key);
                map.put(key, val);
            }
            return val;
        }
    }

    private final ThreadLocal<LocalMyCache> localCaches = new ThreadLocal<LocalMyCache>() {
        protected LocalMyCache initialValue() {
            return new LocalMyCache();
        }
    };

    public V get(K key) {
        return localCaches.get().get(key);
    }
}

ConcurrentHashMap:

class MyCache {
    private final ConcurrentHashMap<K,V> map = new ConcurrentHashMap<K,V>();

    public V get(K key) {
        V val = map.get(key);
        if (val == null) {
            val = computeVal(key);
            map.put(key, val);
        }
        return val;
    }
}

I figure the ThreadLocal solution would initially be slower if there are a lot of threads, because each thread incurs its own cache misses, but over thousands of reads the amortized cost would be lower than with the ConcurrentHashMap solution. Is my intuition correct?

Or is there an even better solution?

Maian asked Jan 12 '13 15:01



2 Answers

Using ThreadLocal as a cache is not good practice.

In most containers, threads are reused via thread pools and thus are never garbage collected, so whatever they hold in a ThreadLocal stays alive too. This can lead to weird behavior.

If you use ConcurrentHashMap, you have to manage it yourself in order to prevent a memory leak.

If you insist, I suggest using weak or soft references and evicting entries after reaching a maximum size.

If you are looking for an in-memory cache solution (rather than reinventing the wheel), try the Guava cache: http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/cache/CacheBuilder.html
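
If adding a library is not an option, the "evict after reaching a maximum size" idea can be roughly sketched with the standard library alone. This is only an illustration, not what Guava does internally: the class name `BoundedCache` and the `create` factory are made up for this example, and it evicts the least recently used entry instead of using weak or soft references.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (name and API are assumptions for this sketch).
final class BoundedCache {
    private BoundedCache() {}

    // Returns a thread-safe map that holds at most maxSize entries and
    // drops the least recently used entry when that limit is exceeded.
    static <K, V> Map<K, V> create(final int maxSize) {
        // accessOrder = true keeps entries ordered from least to most recently accessed.
        Map<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after each insertion; returning true removes the eldest entry.
                return size() > maxSize;
            }
        };
        // LinkedHashMap is not thread-safe, so every call goes through one lock.
        return Collections.synchronizedMap(lru);
    }
}
```

Note that every access, including reads, takes the same lock here, so this trades read throughput for a bounded size. With at most 500 small entries, as in the question, a bound may not be needed at all.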

farmer1992 answered Oct 27 '22 00:10


this computation is very expensive

I assume this is the reason you created the cache and this should be your primary concern.

While the speed of the two solutions might differ slightly (by much less than 100 ns per lookup), I suspect it is more important that you be able to share results between threads, i.e. ConcurrentHashMap is likely to be the best choice for your application, as it is likely to save you more CPU time in the long run.

In short, the lookup overhead of either solution is likely to be tiny compared to the cost of computing the same thing multiple times (once for each thread).
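
As a minimal sketch of sharing results between threads, assuming Java 8 or later (which postdates the question), `ConcurrentHashMap.computeIfAbsent` can also keep two threads from computing the same key at once. The class name `SharedMemoCache` and the `Function` passed to its constructor are assumptions for illustration; the function stands in for the question's `computeVal`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical shared memoization cache (name and constructor are illustrative).
final class SharedMemoCache<K, V> {
    private final ConcurrentMap<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> compute; // the expensive, idempotent computation

    SharedMemoCache(Function<K, V> compute) {
        this.compute = compute;
    }

    V get(K key) {
        // Fast path: get() on a ConcurrentHashMap does not lock,
        // which suits a workload that is almost entirely cache hits.
        V val = map.get(key);
        if (val != null) {
            return val;
        }
        // Slow path: the function is applied at most once per key, so threads
        // that miss on the same key at the same time do not duplicate the work.
        return map.computeIfAbsent(key, compute);
    }
}
```

The function must not return null or modify the map while it runs. Other threads that update the same region of the map may block while a value is being computed, which should rarely matter when misses are as rare as described in the question.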

Peter Lawrey answered Oct 27 '22 01:10