Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Thread safety for high-performance in-memory cache

I have a static in-memory cache that is written to only once an hour (or longer), and read by many threads at an extremely high rate. Conventional wisdom suggests I follow a pattern such as the following:

public static class MyCache
{
    private static IDictionary<int, string> _cache;
    private static ReaderWriterLockSlim _sharedLock;

    static MyCache()
    {
        _cache = new Dictionary<int, string>();
        _sharedLock = new ReaderWriterLockSlim();
    }

    public static string GetData(int key)
    {
        _sharedLock.EnterReadLock();
        try
        {
            string returnValue;
            _cache.TryGetValue(key, out returnValue);
            return returnValue;
        }
        finally
        {
            _sharedLock.ExitReadLock();
        }
    }

    public static void AddData(int key, string data)
    {
        _sharedLock.EnterWriteLock();
        try
        {
            if (!_cache.ContainsKey(key))
                _cache.Add(key, data);
        }
        finally
        {
            _sharedLock.ExitWriteLock();
        }
    }
}

As an excercise in micro-optimization, how can I shave off even more ticks in the relative expense of shared read locks? Time to write can be expensive, since it rarely happens. I need to make reads as fast as possible. Can I just drop the read locks (below) and remain thread-safe in this scenario? Or is there a lock-free version I can use? I'm familiar with memory-fencing but don't know how to safely apply it in this instance.

Note: I'm not tied to either pattern so any suggestions are welcome as long as the end result is faster and in C# 4.x.*

public static class MyCache2
{
    private static IDictionary<int, string> _cache;
    private static object _fullLock;

    static MyCache2()
    {
        _cache = new Dictionary<int, string>();
        _fullLock = new object();
    }

    public static string GetData(int key)
    {
        //Note: There is no locking here... Is that ok?
        string returnValue;
        _cache.TryGetValue(key, out returnValue);
        return returnValue;
    }

    public static void AddData(int key, string data)
    {
        lock (_fullLock)
        {
            if (!_cache.ContainsKey(key))
                _cache.Add(key, data);
        }
    }
}
like image 895
JoeGeeky Avatar asked Nov 19 '11 16:11

JoeGeeky


2 Answers

You don't need a lock when there are threads only ever reading from the data structure. So, since writes are so rare (and, I assume, not concurrent), an option might be to make a full copy of the dictionary, make the modifications to the copy, and then atomically exchange the old dictionary with the new one:

public static class MyCache2
{
    private static IDictionary<int, string> _cache;

    static MyCache2()
    {
        _cache = new Dictionary<int, string>();
    }

    public static string GetData(int key)
    {
        string returnValue;
        _cache.TryGetValue(key, out returnValue);
        return returnValue;
    }

    public static void AddData(int key, string data)
    {
        IDictionary<int, string> clone = Clone(_cache);
        if (!clone.ContainsKey(key))
            clone.Add(key, data);
        Interlocked.Exchange(ref _cache, clone);
    }
}
like image 93
dtb Avatar answered Oct 05 '22 11:10

dtb


I would be looking to go lock free here, and achieve thread safety by simply not changing any published dictionary. What I mean is: when you need to add data, create a complete copy of the dictionary, and append/update/etc the copy. Since this is once an hour this shouldn't be a problem even for large data. Then, when you have made the changes, simply swap the reference from the old dictionary to the new dictionary (reference reads/writes are guaranteed to be atomic).

One caveat: any code that needs consistent state between multiple operations should capture the dictionary into a variable first, I.e.

var snapshot = someField;
// multiple reads on snapshot

This ensures that any related logic is all made using the same version of the data, to avoid confusion when the reference swaps during the operation.

I would also take a lock when writing (not when reading) to ensure no squabbling over the data. There are lock-free multi-writer approaches too (primarily Interlocked.CompareExchange and reapply if it fails), but I would use the simplest approach first, and a single writer is exactly that.

like image 30
Marc Gravell Avatar answered Oct 05 '22 13:10

Marc Gravell