
HashSet that preserves ordering

Tags: c#, .net, hashset


The standard .NET HashSet<T> does not preserve insertion order. In simple tests the order may appear to be preserved by accident, but that is not guaranteed and will not always hold. To see this, it is enough to do some removals between insertions.

See this question for more information on that: Does HashSet preserve insertion order?
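
As a rough illustration of the removal case, the sketch below removes an element and then adds a new one. The class and method names (HashSetOrderDemo, EnumerateAfterRemoval) are made up for this example. The resulting enumeration order depends on the runtime's internal slot reuse, so no particular output is promised; the only point is that it need not match the insertion order 1, 3, 4, 5.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class HashSetOrderDemo
{
    // Returns the enumeration order of a HashSet<int> after a Remove followed by an Add.
    public static List<int> EnumerateAfterRemoval()
    {
        var set = new HashSet<int> { 1, 2, 3, 4 };

        set.Remove(2); // frees a slot inside the set
        set.Add(5);    // the new item may land in the freed slot rather than at the end

        // Insertion order of the surviving items is 1, 3, 4, 5.
        // The order returned here is an implementation detail and may differ.
        return set.ToList();
    }
}
```

Printing string.Join(", ", HashSetOrderDemo.EnumerateAfterRemoval()) on your own runtime shows whether the new item was placed last or in the gap left by the removed one.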

I have written a quick implementation of a set that guarantees insertion order. It uses a Dictionary to look up items and a LinkedList to preserve order. Insertion, removal and lookup all still run in O(1).

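using System.Collections;
using System.Collections.Generic;

// A set that enumerates its items in insertion order.
public class OrderedSet<T> : ICollection<T>
{
    // Maps each item to its node in the linked list, so removal needs no list scan.
    private readonly IDictionary<T, LinkedListNode<T>> m_Dictionary;
    // Holds the items in insertion order.
    private readonly LinkedList<T> m_LinkedList;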

    public OrderedSet()
        : this(EqualityComparer<T>.Default)
    {
    }

    public OrderedSet(IEqualityComparer<T> comparer)
    {
        m_Dictionary = new Dictionary<T, LinkedListNode<T>>(comparer);
        m_LinkedList = new LinkedList<T>();
    }

    public int Count => m_Dictionary.Count;

    public virtual bool IsReadOnly => m_Dictionary.IsReadOnly;

    void ICollection<T>.Add(T item)
    {
        Add(item);
    }

    // Appends the item if it is not already present; returns false for a duplicate.
    public bool Add(T item)
    {
        if (m_Dictionary.ContainsKey(item)) return false;
        var node = m_LinkedList.AddLast(item);
        m_Dictionary.Add(item, node);
        return true;
    }

    public void Clear()
    {
        m_LinkedList.Clear();
        m_Dictionary.Clear();
    }

    // Removes the item via its stored node, which avoids searching the list.
    public bool Remove(T item)
    {
        if (item == null) return false;
        var found = m_Dictionary.TryGetValue(item, out var node);
        if (!found) return false;
        m_Dictionary.Remove(item);
        m_LinkedList.Remove(node);
        return true;
    }

    public IEnumerator<T> GetEnumerator()
    {
        return m_LinkedList.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    public bool Contains(T item)
    {
        return item != null && m_Dictionary.ContainsKey(item);
    }

    public void CopyTo(T[] array, int arrayIndex)
    {
        m_LinkedList.CopyTo(array, arrayIndex);
    }
}
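
The reason removal stays O(1) is that the dictionary stores the LinkedListNode<T> itself rather than the value. LinkedList<T>.Remove(LinkedListNode<T>) unlinks a node directly, whereas LinkedList<T>.Remove(T) has to search the list. Here is a minimal sketch of just that idea, outside the class; the names NodeLookupDemo, order and index are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class NodeLookupDemo
{
    public static List<string> Run()
    {
        var order = new LinkedList<string>();                          // insertion order
        var index = new Dictionary<string, LinkedListNode<string>>();  // item -> its node

        foreach (var item in new[] { "a", "b", "c" })
        {
            // AddLast returns the node it created; keep it for later removal.
            index[item] = order.AddLast(item);
        }

        // Remove "b": look up its node, then unlink that node without scanning the list.
        if (index.TryGetValue("b", out var node))
        {
            index.Remove("b");
            order.Remove(node);
        }

        // A later insertion still goes to the end.
        index["d"] = order.AddLast("d");

        return order.ToList(); // a, c, d
    }
}
```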

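You can get this functionality easily with KeyedCollection<TKey,TItem> (namespace System.Collections.ObjectModel) by specifying the same type argument for TKey and TItem. Lookups go through an internal dictionary, but the items live in a list, so Remove is O(n):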

public class OrderedHashSet<T> : KeyedCollection<T, T>
{
    protected override T GetKeyForItem(T item)
    {
        return item;
    }
}
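
A short usage sketch for this approach, with the class repeated under a different name (OrderedHashSetSketch, chosen only for this example) so the snippet compiles on its own. One thing to watch for: unlike HashSet<T>.Add, adding a duplicate key to a KeyedCollection throws an ArgumentException, so the sketch checks Contains first.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

// Same idea as the class above: each item serves as its own key.
public class OrderedHashSetSketch<T> : KeyedCollection<T, T>
{
    protected override T GetKeyForItem(T item) => item;
}

public static class KeyedCollectionDemo
{
    public static List<string> Run()
    {
        var set = new OrderedHashSetSketch<string> { "a", "b", "c" };

        // Guard against duplicates: Add would throw for a key that already exists.
        if (!set.Contains("b"))
        {
            set.Add("b");
        }

        set.Remove("b"); // removal by key; the backing list shifts later items
        set.Add("d");    // new items go to the end

        return set.ToList(); // a, c, d
    }
}
```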

If you need constant-time Add, Remove and Contains together with order preservation, there is no such collection built into .NET Framework 4.5.

If you're okay with third-party code, take a look at my repository (permissive MIT license): https://github.com/OndrejPetrzilka/Rock.Collections

It contains an OrderedHashSet<T> collection, which:

  • based on classic HashSet<T> source code (from .NET Core)
  • preserves order of insertions and allows manual reordering
  • features reversed enumeration
  • has same operation complexities as HashSet<T>
  • Add and Remove operations are 20% slower compared to HashSet<T>
  • consumes 8 more bytes of memory per item