Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a way to derive IEqualityComparer from IComparer?

TL;DR I'm looking for a way to obtain IEqualityComparer<T> from IComparer<T>, no matter which datatype is T, including case-insensitive options if T is string. Or I need a different solution for this problem.

Here's full story: I'm implementing simple, generic cache with LFU policy. Requirement is that it must be possible to select whether the cache will be case sensitive or case insensitive -- if string happens to be the datatype for cache keys (which is not necessary). In the solution I primarily develop the cache for, I expect hundreds of billions of cache lookups, and cache sizes of max 100.000 entries. Because of that numbers I immediately resigned from using any string manipulation that causes allocations (such as .ToLower().GetHashCode() etc.) and instead opted to use IComparer and IEqualityComparer, as they are standard BCL features. User of this cache can pass the comparers to constructor. Here are relevant fragments of the code:

public class LFUCache<TKey,TValue>
{
    private readonly Dictionary<TKey,CacheItem> entries;
    private readonly SortedSet<CacheItem> lfuList;

    private class CacheItem
    {
        public TKey Key;
        public TValue Value;
        public int UseCount;
    }

    private class CacheItemComparer : IComparer<CacheItem>
    {
        private readonly IComparer<TKey> cacheKeyComparer;

        public CacheItemComparer(IComparer<TKey> cacheKeyComparer)
        {
            this.cacheKeyComparer = cacheKeyComparer;
            if (cacheKeyComparer == null)
                this.cacheKeyComparer = Comparer<TKey>.Default;
        }

        public int Compare(CacheItem x, CacheItem y)
        {
            int UseCount = x.UseCount - y.UseCount;
            if (UseCount != 0) return UseCount;
            return cacheKeyComparer.Compare(x.Key, y.Key);
        }
    }

    public LFUCache(int capacity, IEqualityComparer<TKey> keyEqualityComparer,
                  IComparer<TKey> keyComparer)  // <- here's my problem
    {
        // ...
        entries = new Dictionary<TKey, CacheItem>(keyEqualityComparer);
        lfuList = new SortedSet<CacheItem>(new CacheItemComparer(keyComparer));
    }
    // ...
}

The keyEqualityComparer is used to manage cache entries (so e.g. the key "ABC" and "abc" are equal if user wants to). The keyComparer is used to manage cache entries sorted by UseCount so that it's easy to select the least frequently used one (implemented in CacheItemComparer class).

Example correct usage with custom comparison:

var cache = new LFUCache<string, int>(10000,
    StringComparer.InvariantCultureIgnoreCase,
    StringComparer.InvariantCultureIgnoreCase);

(That looks stupid, but StringComparer implements both IComparer<string> and IEqualityComparer<string>.) The problem is that if user gives incompatible comparers (i.e. case insensitive keyEqualityComparer and case sensitive keyComparer), then the most likely outcome is invalid LFU statistics, and thus lower cache hits at best. The other scenario is also less than desired. Also if the key is more sophisticated (I'll have something resembling Tuple<string,DateTime,DateTime>), it's possible to mess it up more severely.

That's why I'd like to only have a single comparer argument in constructor, but that doesn't seem to work. I'm able to create IEqualityComparer<T>.Equals() with help of IComparer<T>.Compare(), but I'm stuck at IEqualityComparer<T>.GetHashCode() -- which is very important, as you know. If I had got access to private properties of the comparer to check if it's case sensitive or not, I would have used CompareInfo to get hash code.

I like this approach with 2 different data structures, because it gives me acceptable performance and controllable memory consumption -- on my laptop around 500.000 cache additions/sec with cache size 10.000 elements. Dictionary<TKey,TValue> is just used to find data in O(1), and SortedSet<CacheItem> inserts data in O(log n), find element to remove by calling lfuList.Min in O(log n), and find the entry to increment use count also in O(log n).

Any suggestions on how to solve this are welcome. I'll appreciate any ideas, including different designs.

like image 979
Endrju Avatar asked Jun 15 '15 17:06

Endrju


1 Answers

It's not possible to implement an IComparer from an IEqualityComparer as you have no way of knowing whether an unequal item is greater than or less than the other item.

It's not possible to implement an IEqualityComparer from an IComparer as there's no way for you to generate a hash code that is in line with the IComparer's identity.

That said, there's no need for you to have both types of comparers in your case. When computing LRU you're comparing the time since an item was used as the primary comparer and then comparing based on a passed in comparer as a tiebreaker. Just remove that last part; don't have a tiebreaker. Let it be undefined which item leaves the cache when there is a tie for the least recently used. When you do that you only need to accept an IEqualityComparer, not an IComparer.

like image 176
Servy Avatar answered Sep 17 '22 11:09

Servy