Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cache Function results

Tags:

c#

.net

caching

For fun, I'm playing with a class to easily cache function results. The basic idea is that you can take any function you want — though you'd only want to use it for relatively expensive functions — and easily wrap it to use relatively inexpensive dictionary lookups for later runs with the same argument. There's really not much to it:

public class AutoCache<TKey, TValue> 
{  
    public AutoCache(Func<TKey, TValue> FunctionToCache)
    {
        _StoredFunction = FunctionToCache;
        _CachedData = new Dictionary<TKey, TValue>();
    }

    public TValue GetResult(TKey Key)
    {
        if (!_CachedData.ContainsKey(Key)) 
            _CachedData.Add(Key, _StoredFunction(Key));
        return _CachedData[Key];
    }

    public void InvalidateKey(TKey Key)
    {
        _CachedData.Remove(Key);
    }

    public void InvalidateAll()
    {
        _CachedData.Clear();
    }

    private Dictionary<TKey, TValue> _CachedData;
    private Func<TKey, TValue> _StoredFunction; 
}

Unfortunately, there are some additional restrictions that make this much less useful than it could be. There are also some features we could add and other considerations to the implementation. I'm looking for thoughts on ways this can be improved for any of the following points:

  • This requires a function that returns the same result for a given set of arguments (it must be stateless). Probably no way to change this.
  • It's limited to a very narrow delegate range. Could we expand it to easily work for any function that accepts at least one parameter and returns a value, perhaps by wrapping arguments in an anonymous type? Or would we need an additional implemenation for each Func delegate we wanted to support? If so, can we build an abstract class to make this easier?
  • It's not thread-safe.
  • No automatic invalidation. This makes it dangerous for garbage collection. You need to keep it around for a while for it to be useful, and that means you're not going to really ever discard old and potentially un-needed cache items.
  • Can we inherit from this to make the cache bi-directional for the case where the function has a single argument?

As a point of reference, if I ever use this in real code the most likely place I envision it is as part of a business logic layer, where I use this code to wrap a method in the data access layer that just pulls data from a lookup table. In this case, the database trip would be expensive relative to the dictionary and there would almost always be exactly one 'key' value for the lookup, so it's a good match.

like image 215
Joel Coehoorn Avatar asked Jul 22 '09 16:07

Joel Coehoorn


2 Answers

Another name for this automatic caching of function results is memoization. For a public interface, consider something along these lines:

public Func<T,TResult> Memoize<T,TResult>(Func<T,TResult> f)

... and simply use polymorphism to store T's in a dictionary of object.

Extending the delegate range could be implemented via currying and partial function application. Something like this:

static Func<T1,Func<T2,TResult>> Curry(Func<T1,T2,TResult> f)
{
    return x => y => f(x, y);
}
// more versions of Curry

Since Curry turns functions of multiple arguments into functions of single arguments (but that may return functions), the return values are eligible for memoization themselves.

Another way to do it would be to use reflection to inspect the delegate type, and store tuples in the dictionary rather than simply the argument type. A simplistic tuple would be simply an array wrapper whose hashcode and equality logic used deep comparisons and hashing.

Invalidation could be helped with weak references, but creating dictionaries with WeakReference keys is tricky - it's best done with the support of the runtime (WeakReference values is much easier). I believe there are some implementations out there.

Thread safety is easily done by locking on the internal dictionary for mutation events, but having a lock-free dictionary may improve performance in heavily concurrent scenarios. That dictionary would probably be even harder to create - there's an interesting presentation on one for Java here though.

like image 110
Barry Kelly Avatar answered Sep 22 '22 22:09

Barry Kelly


Wow - what serendipity - I had just recently posted a question about opaque keys in C# ... and because I'm trying to implement something related to function result caching. How funny.

This type of metaprogramming can be tricky with C# ... especially because generic type parameters can result in awkward code duplication. You often end up repeating almost the same code in multiple places, with different type parameters, to achieve type safety.

So here's my variation on your approach that uses my opaque key pattern and closures to create cacheable functions. The sample below demonstrates the pattern with either one or two arguments, but it's relatively easy to extend to more. It also uses extension methods to create a transparent pattern for wrapping a Func<> with a cachable Func<> using the AsCacheable() method. Closures capture the cache that is associated with the function - and make it's existence transparent to other callers.

This technique has many of the same limitations as your approach (thread safety, holding on to references, etc) - I suspect they aren't too hard to overcome - but it DOES support an easy way to extend to multiple parameters, and it allows cacheable functions to be completely substitutable with regular ones - since they are just a wrapper delegate.

It's also worth noting that if you create a second instance of the CacheableFunction - you get a separate cache. This can be both a strength and a weakness ... since it some situations you may not realize this is happening.

Here's the code:

public interface IFunctionCache
{
    void InvalidateAll();
    // we could add more overloads here...
}

public static class Function
{
    public class OpaqueKey<A, B>
    {
        private readonly object m_Key;

        public A First { get; private set; }
        public B Second { get; private set; }

        public OpaqueKey(A k1, B k2)
        {
            m_Key = new { K1 = k1, K2 = k2 };
            First = k1;
            Second = k2;
        }

        public override bool Equals(object obj)
        {
            var otherKey = obj as OpaqueKey<A, B>;
            return otherKey == null ? false : m_Key.Equals(otherKey.m_Key);
        }

        public override int GetHashCode()
        {
            return m_Key.GetHashCode();
        }
    }

    private class AutoCache<TArgs,TR> : IFunctionCache
    {
        private readonly Dictionary<TArgs,TR> m_CachedResults 
            = new Dictionary<TArgs, TR>();

        public bool IsCached( TArgs arg1 )
        {
            return m_CachedResults.ContainsKey( arg1 );
        }

        public TR AddCachedValue( TArgs arg1, TR value )
        {
            m_CachedResults.Add( arg1, value );
            return value;
        }

        public TR GetCachedValue( TArgs arg1 )
        {
            return m_CachedResults[arg1];
        }

        public void InvalidateAll()
        {
            m_CachedResults.Clear();
        }
    }

    public static Func<A,TR> AsCacheable<A,TR>( this Func<A,TR> function )
    {
        IFunctionCache ignored;
        return AsCacheable( function, out ignored );
    }

    public static Func<A, TR> AsCacheable<A, TR>( this Func<A, TR> function, out IFunctionCache cache)
    {
        var autocache = new AutoCache<A,TR>();
        cache = autocache;
        return (a => autocache.IsCached(a) ?
                     autocache.GetCachedValue(a) :
                     autocache.AddCachedValue(a, function(a)));
    }

    public static Func<A,B,TR> AsCacheable<A,B,TR>( this Func<A,B,TR> function )
    {
        IFunctionCache ignored;
        return AsCacheable(function, out ignored);
    }

    public static Func<A,B,TR> AsCacheable<A,B,TR>( this Func<A,B,TR> function, out IFunctionCache cache )
    {
        var autocache = new AutoCache<OpaqueKey<A, B>, TR>();
        cache = autocache;
        return ( a, b ) =>
                   {
                       var key = new OpaqueKey<A, B>( a, b );
                       return autocache.IsCached(key)
                                  ? autocache.GetCachedValue(key)
                                  : autocache.AddCachedValue(key, function(a, b));
                   };
    }
}

public class CacheableFunctionTests
{
    public static void Main( string[] args )
    {
        Func<string, string> Reversal = s => new string( s.Reverse().ToArray() );

        var CacheableReverse = Reversal.AsCacheable();

        var reverse1 = CacheableReverse("Hello");
        var reverse2 = CacheableReverse("Hello"); // step through to prove it uses caching

        Func<int, int, double> Average = (a,b) => (a + b)/2.0;
        var CacheableAverage = Average.AsCacheable();

        var average1 = CacheableAverage(2, 4);
        var average2 = CacheableAverage(2, 4);
    }
}
like image 45
LBushkin Avatar answered Sep 19 '22 22:09

LBushkin