Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why would I use a HashSet over a Dictionary?

I'm trying to implement a list of cached paths on a A* algorithm. Currently, the cached paths are stored in a list like this:

readonly List<CachedPath> _cachedPaths = new List<CachedPath>();

The operations performed over this list are:

FirstOrDefault to get an element that satisfies certain conditions

var cached = _cachedPaths.FirstOrDefault(p => p.From == from && p.To == target && p.Actor == self);

Remove and element

_cachedPaths.Remove(cached);

Additions

_cachedPaths.Add(new CachedPath {
                    From = from,
                    To = target,
                    Actor = self,
                    Result = pb,
                    Tick = _world.WorldTick
                });

NOTE: The class CachedPath has GetHashCode and Equals overriden by just the From, To and Actor, so two instances that have these same attributes have the same hash and equality.

Given that quick lookups (Contains), insertions and deletions in a 'HashSet' are O(1) (if I'm not mistaken), I considered using a 'HashSet' to do these operations. The only problem is the FirstOrDefault, that I had to enumerate the whole collection to get it.

Given this problem, I considered also using a Dictionary indexed by the hash of From, To and Actor:

Dictionary<int, CachedPath> cachedPath

Once again, if I'm not mistaken, Dictionary also offers O(1) in insertions, deletions, and also retrieval by Key. This leads me to think that a Dictionary is a HashSet + O(1) element retrieval capabilities.

Am I missing something? Is really Dictionary better than HashSet in the sense that it supports more operations?

Thanks in advance.

like image 276
David Jiménez Martínez Avatar asked Jan 18 '15 11:01

David Jiménez Martínez


People also ask

Which is faster HashSet or Dictionary?

The List type is the clear winner here, and no surprise really given that both HashSet and Dictionary ensures that that are no duplicated, what's surprising though is how much more overhead you incur when dealing with reference types! They say hash lookups are fast and it's no lie!

What is the difference between HashSet and Dictionary?

A HashSet, similar to a Dictionary, is a hash-based collection, so look ups are very fast with O(1). But unlike a dictionary, it doesn't store key/value pairs; it only stores values. So, every objects should be unique and this is determined by the value returned from the GetHashCode method.

When should you use a HashSet?

A HashSet is usually used for high-performance operations involving a set of unique data. Since HashSet contains only unique elements, its internal structure is optimized for faster searches. Note that you can store a single null value in a HashSet.

Which is faster hash table or Dictionary?

Dictionary is faster than hashtable as dictionary is a generic strong type. Hashtable is slower as it takes object as data type which leads to boxing and unboxing.


2 Answers

Dictionary is not better than HashSet, it's just different.

  • You use a HashSet when you want to store an unordered collection of items, and
  • You use a Dictionary when you want to associate a set of items called "keys" with another collection of items called "values"

One could think of a HashSet as a Dictionary with no associated values (in fact, HashSet is sometimes implemented using a Dictionary behind the scene) but it is not necessary to think about it in this way: thinking of the two as of entirely different things works fine, too.

In your case you could potentially improve performance by making a dictionary by actor, like this:

Dictionary<ActorType,List<CachedPath>> _cachedPathsByActor

This way your linear search would quickly choose a sub-list based on an actor, and then search linearly by target:

var cached = _cachedPathsByActor[self].FirstOrDefault(p => p.From == from && p.To == target);

or by making an equality comparer that considers all three items, and using a Dictionary with CachedPath as both keys and values, and that custom IEqualityComparer<T> as the key comparer:

class CachedPathEqualityComparer : IEqualityComparer<CachedPath> {
    public bool Equals(CachedPath a, CachedPath b) {
        return a.Actor == b.Actor
            && a.From == b.From
            && a.To == b.To;
    }
    public int GetHashCode(CachedPath p) {
        return 31*31*p.Actor.GetHashCode()+31*p.From.GetHashCode()+p.To.GetHashCode();
    }
}
...
var _cachedPaths = new Dictionary<CachedPath,CachedPath>(new CachedPathEqualityComparer());
...
CachedPath cached;
if (_cachedPaths.TryGetValue(self, out cached)) {
    ...
}

However, this approach assumes that there would be at most one item in the dictionary with identical From, To, and Actor.

like image 108
Sergey Kalinichenko Avatar answered Oct 09 '22 00:10

Sergey Kalinichenko


A hashset will not throw an exception when performing an add. Instead it returns a bool reflecting success of the add.

Also a hashset does not require a keyValue pair. I use hashsets to guarantee a collection of unique values.

like image 23
Scott Nimrod Avatar answered Oct 09 '22 00:10

Scott Nimrod