 

Thread-safe memoization

Let's take Wes Dyer's approach to function memoization as the starting point:

public static Func<A, R> Memoize<A, R>(this Func<A, R> f)
{
  var map = new Dictionary<A, R>();
  return a =>
    {
      R value;
      if (map.TryGetValue(a, out value))
        return value;
      value = f(a);
      map.Add(a, value);
      return value;
    };
}

The problem is, when using it from multiple threads, we can get into trouble:

Func<int, int> f = ...
var f1 = f.Memoize();
...
// in thread 1:
var y1 = f1(1);
// in thread 2:
var y2 = f1(1); // We may be recalculating f(1) here!
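To see the race in action, here is a minimal, self-contained sketch (the MemoRaceDemo class and the slowSquare function are illustrative, not from the original post) that counts how often the underlying function actually runs when two tasks hit the non-thread-safe version at the same time:

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class MemoRaceDemo
{
    // The non-thread-safe Memoize from above, repeated so the demo is runnable.
    public static Func<A, R> Memoize<A, R>(this Func<A, R> f)
    {
        var map = new Dictionary<A, R>();
        return a =>
        {
            R value;
            if (map.TryGetValue(a, out value))
                return value;
            value = f(a);
            map.Add(a, value);
            return value;
        };
    }

    static void Main()
    {
        int calls = 0;
        Func<int, int> slowSquare = x =>
        {
            Interlocked.Increment(ref calls);
            Thread.Sleep(100);                  // simulate expensive work
            return x * x;
        };

        var memo = slowSquare.Memoize();

        try
        {
            // Both tasks ask for the same argument before either has cached it.
            Task.WaitAll(Task.Run(() => memo(1)), Task.Run(() => memo(1)));
        }
        catch (AggregateException ex)
        {
            // Dictionary isn't thread-safe, so the concurrent Add can even throw.
            Console.WriteLine(ex.InnerException.Message);
        }

        Console.WriteLine("underlying calls: " + calls);  // usually 2, not 1
    }
}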

Let's try to avoid this. Locking on map:

public static Func<A, R> Memoize<A, R>(this Func<A, R> f)
{
  var map = new Dictionary<A, R>();
  return a =>
    {
      R value;
      lock (map)
      {
        if (map.TryGetValue(a, out value))
          return value;
        value = f(a);
        map.Add(a, value);
      }
      return value;
    };
}

is clearly a horrible idea, because it prevents us from calculating f1 on many different arguments at once. Locking on a won't work if a has a value type (and at any rate is a bad idea, since we don't control a and outside code may lock on it, too).

Here are two options I can think of:

Assuming a Lazy<T> class for lazy evaluation (see here):

public static Func<A, R> Memoize<A, R>(this Func<A, R> f)
{
  var map = new Dictionary<A, Lazy<R>>();
  return a =>
    {
      Lazy<R> result;
      lock (map)
      {
        if (!map.TryGetValue(a, out result))
        {
          // Wrap the computation in a Lazy so it is deferred until Value is read.
          result = new Lazy<R>(() => f(a));
          map.Add(a, result);
        }
      }
      // The (potentially slow) f(a) runs outside the lock on map.
      return result.Value;
    };
}
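The linked Lazy<T> class isn't shown; on .NET 4 and later the built-in System.Lazy<T> does the job, but for completeness here is a minimal double-checked-locking sketch of the kind of class the code above assumes (a constructor taking a Func<T> and a thread-safe Value property):

using System;

// Minimal sketch of a thread-safe Lazy<T>; System.Lazy<T> (.NET 4+) can be used instead.
public class Lazy<T>
{
    private readonly object _sync = new object();
    private Func<T> _factory;
    private T _value;
    private volatile bool _created;

    public Lazy(Func<T> factory)
    {
        _factory = factory;
    }

    public T Value
    {
        get
        {
            // Double-checked locking: the factory runs at most once,
            // even when several threads read Value concurrently.
            if (!_created)
            {
                lock (_sync)
                {
                    if (!_created)
                    {
                        _value = _factory();
                        _factory = null;    // let the delegate be collected
                        _created = true;
                    }
                }
            }
            return _value;
        }
    }
}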

Or keeping an additional dictionary of objects for synchronization:

public static Func<A, R> Memoize<A, R>(this Func<A, R> f)
{
  var map = new Dictionary<A, R>();
  var mapSync = new Dictionary<A, object>();
  return a =>
    {
      R value;
      object sync;

      // Get (or create) the lock object dedicated to this argument.
      lock (mapSync)
      {
        if (!mapSync.TryGetValue(a, out sync))
        {
          sync = new object();
          mapSync[a] = sync;
        }
      }

      // Fast path: the value is already cached.
      lock (map)
      {
        if (map.TryGetValue(a, out value))
          return value;
      }

      // Compute under the per-argument lock so calls for other arguments aren't blocked.
      lock (sync)
      {
        // Re-check: another thread may have computed the value while we waited.
        lock (map)
        {
          if (map.TryGetValue(a, out value))
            return value;
        }
        value = f(a);
        lock (map)
        {
          map[a] = value;
        }
        return value;
      }
    };
}

Any better options?

Asked Aug 10 '09 by Alexey Romanov




1 Answer

Use .NET 4.0's ConcurrentDictionary<A, R> without the unnecessary Lazy<R>.
The key is GetOrAdd(A, Func<A, R>), which makes the whole thing a beautifully trivial lambda.

public static Func<A, R> Memoize<A, R>(this Func<A, R> f)
{
    var cache = new ConcurrentDictionary<A, R>();
    return a => cache.GetOrAdd(a, f);
}
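Usage stays the same as with the original extension; a quick illustrative sketch (slowSquare is made up for the example, and it assumes System.Threading and System.Threading.Tasks are imported):

// Assumes the Memoize extension above is in scope.
Func<int, int> slowSquare = x => { Thread.Sleep(100); return x * x; };
var fast = slowSquare.Memoize();

// Safe to call from many threads; each distinct argument is cached after its
// first computation (though, as the update below notes, it may occasionally
// be computed more than once while the first call is still in flight).
Parallel.For(0, 20, i => Console.WriteLine(fast(i % 3)));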

Update: The above solution does allow multiple simultaneous readers and writers with minimal overhead. But it doesn't prevent f(a) from being executed more than once for the same argument (during the period while it is being calculated).

If that is vital to you, you could wrap the value in Lazy<R>, but you incur a cost on every read.

public static Func<A, R> Memoize<A, R>(this Func<A, R> f)
{
    var cache = new ConcurrentDictionary<A, Lazy<R>>();
    return a => cache.GetOrAdd(a, new Lazy<R>(() => f(a))).Value;
}

Update: Timing tests for a million reads of a pre-populated 1000-item cache show 19 ms for ConcurrentDictionary (the same as a plain Dictionary) but 720 ms for the Lazy version.
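The harness behind those numbers isn't shown in the answer; a rough sketch of how one might reproduce the comparison (the names and structure here are my own, and the absolute figures will vary by machine) could be:

using System;
using System.Collections.Concurrent;
using System.Diagnostics;

static class MemoBenchmark
{
    static void Main()
    {
        const int items = 1000;
        const int reads = 1000000;

        var plain = new ConcurrentDictionary<int, int>();
        var lazy = new ConcurrentDictionary<int, Lazy<int>>();
        for (int i = 0; i < items; i++)
        {
            int x = i;                              // avoid capturing the loop variable
            plain[x] = x * x;
            lazy[x] = new Lazy<int>(() => x * x);
        }

        long sink = 0;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < reads; i++)
            sink += plain.GetOrAdd(i % items, k => k * k);
        Console.WriteLine("ConcurrentDictionary<A, R>:       " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        for (int i = 0; i < reads; i++)
        {
            int key = i % items;
            // Mirrors the Lazy version above: a new Lazy<R> wrapper is allocated on every read.
            sink += lazy.GetOrAdd(key, new Lazy<int>(() => key * key)).Value;
        }
        Console.WriteLine("ConcurrentDictionary<A, Lazy<R>>: " + sw.ElapsedMilliseconds + " ms");

        Console.WriteLine(sink);                    // keep the results observable
    }
}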

If that sounds too steep, you can get the best of both worlds with a more complex solution.

public static Func<A, R> Memoize<A, R>(this Func<A, R> f)
{
    var cache = new ConcurrentDictionary<A, R>();
    var syncMap = new ConcurrentDictionary<A, object>();
    return a =>
    {
        R r;
        if (!cache.TryGetValue(a, out r))
        {
            // Serialize computation per argument so f(a) runs only once for each key.
            var sync = syncMap.GetOrAdd(a, new object());
            lock (sync)
            {
                r = cache.GetOrAdd(a, f);
            }
            // The lock object is only needed while the value is being computed.
            syncMap.TryRemove(a, out sync);
        }
        return r;
    };
}
Answered Sep 29 '22 by Nigel Touch