
AppFabric Cache concurrency issue?

Tags:

appfabric

While stress testing a prototype of our brand new primary system, I ran into a concurrency issue with AppFabric Cache. When many threads concurrently call DataCache.Get() and Put() with the same cache key, storing a relatively large object, I receive "ErrorCode:SubStatus:There is a temporary failure. Please retry later." It is reproducible with the following code:

        var dcfc = new DataCacheFactoryConfiguration
        {
            Servers = new[] {new DataCacheServerEndpoint("localhost", 22233)},
            SecurityProperties = new DataCacheSecurity(DataCacheSecurityMode.None, DataCacheProtectionLevel.None),
        };

        var dcf = new DataCacheFactory(dcfc);
        var dc = dcf.GetDefaultCache();

        const string key = "a";
        var value = new int [256 * 1024]; // 1MB

        for (int i = 0; i < 300; i++)
        {
            var putT = new Thread(() => dc.Put(key, value));
            putT.Start();               

            var getT = new Thread(() => dc.Get(key));
            getT.Start();
        }

The issue does not appear when Get() is called with a different key, or when access to the DataCache is synchronized. Obtaining a new DataCache from the DataCacheFactory for each call (DataCache is supposed to be thread-safe) or increasing the timeouts has no effect; the error still occurs. It seems very strange to me that Microsoft would leave such a bug in. Has anybody faced a similar issue?

Frantisek Jandos asked Jul 28 '11 13:07


2 Answers

I also see the same behavior, and my understanding is that this is by design. The cache supports two concurrency models:

  • Optimistic concurrency model methods: Get, Put, ...
  • Pessimistic concurrency model methods: GetAndLock, PutAndUnlock, Unlock

If you use optimistic concurrency model methods such as Get, you have to be prepared to receive DataCacheErrorCode.RetryLater and handle it appropriately. I also use a retry approach.
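A retry approach of this kind can be sketched as a small generic helper. This is only an illustration, not AppFabric API: the `TransientRetry` class, its `Execute` method, and the `maxAttempts` and `baseDelayMs` defaults are all assumed names and values, and the caller decides which exceptions count as transient.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of AppFabric): retries an operation on transient errors.
public static class TransientRetry
{
    // Runs `operation`; if it throws and `isTransient` returns true, waits and tries again.
    // Gives up (rethrows) after `maxAttempts` attempts or on a non-transient exception.
    // The default values are illustrative, not recommendations.
    public static T Execute<T>(Func<T> operation, Func<Exception, bool> isTransient,
                               int maxAttempts = 5, int baseDelayMs = 50)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                if (attempt >= maxAttempts || !isTransient(ex))
                {
                    throw; // rethrow without losing the original stack trace
                }

                // Linear backoff: wait a little longer after each failed attempt.
                Thread.Sleep(baseDelayMs * attempt);
            }
        }
    }
}

// Possible use with the cache from the question (assumes `dc` and `key` exist):
// var value = TransientRetry.Execute(
//     () => dc.Get(key),
//     ex =>
//     {
//         var dce = ex as DataCacheException;
//         return dce != null && dce.ErrorCode == DataCacheErrorCode.RetryLater;
//     });
```

Passing the transient check in as a delegate keeps the helper independent of the caching assembly, so the same wrapper could be used for Put as well as Get.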

You might find more information at MSDN: Concurrency Models

David Pokluda answered Sep 19 '22 09:09


We have seen this problem in our code as well. We solve it by wrapping the Get method to catch exceptions and retry the call N times before falling back to a direct request to SQL.

Here is the code we use to get data from the cache:

    private static bool TryGetFromCache(string cacheKey, string region, out GetMappingValuesToCacheResult cacheResult, int counter = 0)
    {
        cacheResult = new GetMappingValuesToCacheResult();

        try
        {
            if (_cache == null) return false;

            // Use "as" instead of a cast: it yields null rather than throwing on a type mismatch.
            cacheResult = _cache.Get(cacheKey, region) as GetMappingValuesToCacheResult;

            return cacheResult != null;
        }
        catch (DataCacheException dataCacheException)
        {
            switch (dataCacheException.ErrorCode)
            {
                case DataCacheErrorCode.KeyDoesNotExist:
                case DataCacheErrorCode.RegionDoesNotExist:
                    return false;
                case DataCacheErrorCode.Timeout:
                case DataCacheErrorCode.RetryLater:
                    if (counter > 9) return false; // we have already retried 10 times, so give up.

                    counter++;
                    Thread.Sleep(100); // brief pause before the next attempt
                    return TryGetFromCache(cacheKey, region, out cacheResult, counter);
                default:
                    EventLog.WriteEntry(EventViewerSource, "TryGetFromCache: DataCacheException caught:\n" +
                            dataCacheException.Message, EventLogEntryType.Error);

                    return false;
            }
        }
    }

Then, when we need to get something from the cache, we do:

    TryGetFromCache(key, region, out cachedMapping)

This lets us use Try-style methods that encapsulate the exceptions. If it returns false, we know something is wrong with the cache and we can access SQL directly.
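The calling side of this pattern might look roughly like the sketch below. The `CacheAside` class, the `TryGet<T>` delegate, and the `loadFromSource` callback are hypothetical names introduced for illustration; the delegate omits the region parameter for brevity, and the loader stands in for whatever SQL query the application actually runs.

```csharp
using System;

// Hypothetical helper: try the cache first, fall back to the source of truth.
public static class CacheAside
{
    // Delegate shaped like a TryGetFromCache-style method (assumed signature).
    public delegate bool TryGet<T>(string key, out T value);

    // Returns the cached value when the Try method succeeds; otherwise calls the loader.
    public static T GetOrLoad<T>(string key, TryGet<T> tryGetFromCache, Func<string, T> loadFromSource)
    {
        T value;
        if (tryGetFromCache(key, out value))
        {
            return value;
        }

        // A miss and a cache failure look the same to the caller: go to the source directly.
        return loadFromSource(key);
    }
}
```

Because the Try method swallows cache exceptions, the caller needs no try/catch of its own; the trade-off is that a plain cache miss and a failing cache cluster are indistinguishable at this level.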

Frode Stenstrøm answered Sep 22 '22 09:09