
High performance caching

The following code is supposed to cache the last value. LastValueCache is a cache that can be accessed by many threads (which is why I use shared memory). It is OK for me to have race conditions, but I want other threads to see the changed LastValueCache.

using System.Threading;

class Repository
{
    private Item _lastValueCache;

    public Item LastValueCache
    {
        get
        {
            Thread.MemoryBarrier();    // full fence before the read
            Item result = _lastValueCache;
            Thread.MemoryBarrier();    // full fence after the read
            return result;
        }
        set
        {
            Thread.MemoryBarrier();    // full fence before the write
            _lastValueCache = value;
            Thread.MemoryBarrier();    // full fence after the write
        }
    }

    public void Save(Item item)
    {
        SaveToDatabase(item);
        Item cached = LastValueCache;
        if (cached == null || item.Stamp > cached.Stamp)
        {
            LastValueCache = item;
        }
    }

    public void Remove(Timestamp stamp)
    {
        RemoveFromDatabase(stamp);
        Item cached = LastValueCache;
        if (cached != null && cached.Stamp == stamp)
        {
            LastValueCache = null;
        }
    }

    public Item Get(Timestamp stamp)
    {
        Item cached = LastValueCache;
        if (cached != null && cached.Stamp == stamp)
        {
            return cached;
        }

        return GetFromDatabase(stamp);
    }
}

The Repository object is used by many threads. I do not want to use locking because it would affect performance, which in my case is more important than data consistency. The question is: what is the smallest synchronization mechanism that will fit my need? Maybe volatile, or a single MemoryBarrier in the getter and setter, would be enough?
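
For reference, a minimal sketch of the volatile variant I have in mind (the backing-field name is mine):

class Item { /* Stamp etc. */ }

class Repository
{
    // volatile gives acquire semantics to every read of this reference and
    // release semantics to every write, so other threads see a fresh value.
    private volatile Item _lastValueCache;

    public Item LastValueCache
    {
        get { return _lastValueCache; }
        set { _lastValueCache = value; }
    }
}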

asked Mar 17 '14 by Pellared


3 Answers

If this is stupid, you don't need to vote me down.
Just tell me and I will delete it.
But I am not following this logic.

public void Save(Item item)
{
    SaveToDatabase(item);
    Item cached = LastValueCache;
    if (cached == null || item.Stamp > cached.Stamp)
    {
        LastValueCache = item;
    }
}

You are worried about memory-access milliseconds, but you are waiting on a database write before you update the cache.
Based on public Item Get(Timestamp stamp), Stamp is the key.

Let's assume a DB write is 20 ms,
a DB read is 10 ms,
and a cache get or cache set is 2 ms.

public void Save(Item item)
    SaveToDatabase(item);                               // 20 ms
    Item cached = LastValueCache;                       //  2 ms
    if (cached == null || item.Stamp > cached.Stamp)    //  1 ms
        LastValueCache = item;                          //  2 ms

During that 23 ms prior to the LastValueCache = item; assignment, any call to public Item Get(Timestamp stamp) is going to hit the database and not the cache.

During the same 23 ms, any read of LastValueCache is going to return a value that is stale by up to 23 ms. The stated objective is for other threads to see the changed LastValueCache, but they are seeing a stale one.

Same thing with Remove.
You are going to have several hits to the database you could have avoided.

What are you trying to achieve?
Have you profiled this?

My bet is the bottleneck is the calls to the database.
A database call is 1000x longer than the difference between a lock and a MemoryBarrier.

public void Save(Item item)
{
   // add logic so that the prior asynchronous call to SaveToDatabase is complete;
   // if not, wait for it to complete
   // LastValueCache will possibly be replaced, so you need the last item in the database
   // the time for a lock is not really a factor, as it will be faster than the prior update
   Item cached = LastValueCache;
   if (cached == null || item.Stamp > cached.Stamp)
   {
       LastValueCache = item;
   }
   // make the next call a task or background thread so it does not block
   SaveToDatabase(item);
}

You could even change the logic to only wait for the prior call if you are setting LastValueCache = item;
but you need to throttle the database somehow.
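
Something like this sketch is what I mean (the SemaphoreSlim throttle, the field names, and the long Stamp are my inventions, not a finished design): publish to the cache first, then persist in the background with the database throttled.

using System.Threading;
using System.Threading.Tasks;

class Item { public long Stamp; }

class Repository
{
    private volatile Item _lastValueCache;
    // Allow only one in-flight database write; widen if the database can take more.
    private readonly SemaphoreSlim _dbThrottle = new SemaphoreSlim(1);

    public void Save(Item item)
    {
        // Publish to the cache first so readers see the new value immediately.
        Item cached = _lastValueCache;
        if (cached == null || item.Stamp > cached.Stamp)
        {
            _lastValueCache = item;
        }

        // Persist in the background; the semaphore throttles the database to
        // one write at a time. Note it does not guarantee FIFO order - strict
        // ordering needs a queue, as described below.
        Task.Run(async () =>
        {
            await _dbThrottle.WaitAsync();
            try { SaveToDatabase(item); }
            finally { _dbThrottle.Release(); }
        });
    }

    private void SaveToDatabase(Item item) { /* ... */ }
}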

The next step would be to cache the last X items and use that in Item Get(Timestamp stamp).
The database calls are what you need to optimize.
Again, you need to profile.

After that, the logic would get more complex, but feed the database calls to a BlockingCollection. You would need to be sure the last-X cache is bigger than the BlockingCollection's size; if not, block and wait for the BC to clear. And you would need to use the same BC for inserts and deletes so they are processed in order. You could get smart enough that you just don't insert a record that has a pending delete, and don't insert or delete just a single record at a time.
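
A sketch of that shape (class and member names here are mine, not an established API): one bounded BlockingCollection carries both inserts and deletes, and a single consumer drains it in order, writing batches instead of single records.

using System.Collections.Generic;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class DbOp { /* insert or delete, plus the record */ }

class WriteBehindQueue
{
    // One queue for inserts and deletes so they stay in order.
    // Bounded so producers block when the database falls behind.
    private readonly BlockingCollection<DbOp> _ops =
        new BlockingCollection<DbOp>(boundedCapacity: 1000);

    public WriteBehindQueue()
    {
        Task.Run(Consume); // a single consumer preserves ordering
    }

    public void Enqueue(DbOp op) => _ops.Add(op); // blocks when the queue is full

    private void Consume()
    {
        var batch = new List<DbOp>();
        foreach (var op in _ops.GetConsumingEnumerable())
        {
            batch.Add(op);
            // Drain whatever else is already queued, up to a batch cap.
            while (batch.Count < 100 && _ops.TryTake(out var next))
                batch.Add(next);
            WriteBatchToDatabase(batch); // one round trip per batch, in order
            batch.Clear();
        }
    }

    private void WriteBatchToDatabase(List<DbOp> batch) { /* ... */ }
}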

answered Oct 21 '22 by paparazzo

I implemented a thread-safe pseudo-LRU designed for concurrent workloads: ConcurrentLru. Performance is very close to ConcurrentDictionary, ~10x faster than MemoryCache, and the hit rate is better than a conventional LRU. A full analysis is provided in the GitHub link below.

Usage looks like this:

int capacity = 666;
var lru = new ConcurrentLru<int, SomeItem>(capacity);

var value = lru.GetOrAdd(1, (k) => new SomeItem(k));

GitHub: https://github.com/bitfaster/BitFaster.Caching

Install-Package BitFaster.Caching
answered Oct 21 '22 by Alex Peck


volatile should be sufficient and the most performant option for your case. There are potential scenarios where volatile alone is not sufficient for ensuring a read gets the most up-to-date value, but in this case I don't think it would be an important factor. See the "volatile keyword" section of this awesome article by Joe Albahari for details.
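
A minimal sketch of what that looks like (the backing-field name is an assumption). Volatile.Read/Volatile.Write are the explicit equivalents of a volatile field, giving acquire/release semantics without the two full fences per access used in the question:

using System.Threading;

class Item { /* Stamp etc. */ }

class Repository
{
    private Item _lastValueCache;

    public Item LastValueCache
    {
        // Acquire semantics on the read, release semantics on the write -
        // the same visibility guarantee a volatile field provides.
        get { return Volatile.Read(ref _lastValueCache); }
        set { Volatile.Write(ref _lastValueCache, value); }
    }
}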

answered Oct 21 '22 by Michael Gunter