Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get the closest item to my key from a SortedDictionary?

Currently I'm using a binary search over a SortedList<T,U> for a specific number, and if it doesn't exist I get the closest lower-bound key-item instead.

I saw that it was rather slow in inserting unsorted data which I'm doing a lot of.

Is there a way to do something similar with the SortedDictionary, or should I just stick to my SortedList?

like image 996
Rusty Shackleford Avatar asked Jul 27 '12 15:07

Rusty Shackleford


1 Answers

SortedList<K, V> is really slow when inserting data as it shifts <=N elements in internal array each time a new element is added. The complexity of addition is O(N). Nevertheless it supports binary search which allows to find exact element or its neighbors in O(log N).

Balanced binary tree is the best data structure to solve your problem. You'll be able to to do the following operations w/ logarithmic complexity:

  1. Add item in O(log N) vs. O(N) in SortedList<K, V>
  2. Remove item in O(log N)
  3. Search item or its nearest in O(log N)

Looking for element or its nearest lower-bound in binary tree is simple:

  1. Go vertically through the tree from root to child in order to find your key. If key < node, then go to left child, otherwise to the right one.
  2. If you found the key, return
  3. If key not found, nearest left parent will be the one you are looking for (nearest lower-bound)
  4. If there is no left parents, just take the last visited node, it is minimal node in the tree.

There are many articles describing how to implement binary tree. Nevertheless I'm going to reuse .NET Framework collection using a kind of hack :)

Now, I'm gonna present to you SortedSet<T> which itself is red-black tree. It has one drawback, it has no ability to find nearest nodes quickly. But we know the algorithm of search in tree (it's described in 1.) and it is implemented in SortedSet<T>.Contains method (decompiled at the bottom*). Now we can capture all nodes from root to the last visited node during traversal using our custom comparer. After that we can find nearest lower-bound node using algorithm above:

public class LowerBoundSortedSet<T> : SortedSet<T> {

    private ComparerDecorator<T> _comparerDecorator;

    private class ComparerDecorator<T> : IComparer<T> {

        private IComparer<T> _comparer;

        public T LowerBound { get; private set; }

        private bool _reset = true;

        public void Reset()
        {
            _reset = true;
        }

        public ComparerDecorator(IComparer<T> comparer)
        {
            _comparer = comparer;
        }

        public int Compare(T x, T y)
        {
            int num = _comparer.Compare(x, y);
            if (_reset)
            {
                LowerBound = y;
            }
            if (num >= 0)
            {
                LowerBound = y;
                _reset = false;
            }
            return num;
        }
    }

    public LowerBoundSortedSet()
        : this(Comparer<T>.Default) {}

    public LowerBoundSortedSet(IComparer<T> comparer)
        : base(new ComparerDecorator<T>(comparer)) {
        _comparerDecorator = (ComparerDecorator<T>)this.Comparer;
    }

    public T FindLowerBound(T key)
    {
        _comparerDecorator.Reset();
        this.Contains<T>(key);
        return _comparerDecorator.LowerBound;
    }
}

You see that finding nearest node takes no more than usual search, i.e. O(log N). So, this is the fastest solution for your problem. This collection is as fast as SortedList<K, V> in finding nearest and is as fast as SortedSet<T> in addition.

What about SortedDictionary<K, V>? It is almost the same as SortedSet<T> except one thing: each key has a value. I hope you will be able to do the same with SortedDictionary<K, V>.

*Decompiled SortedSet<T>.Contains method:

public virtual bool Contains(T item)
{
  return this.FindNode(item) != null;
}

internal virtual SortedSet<T>.Node FindNode(T item)
{
  for (SortedSet<T>.Node node = this.root; node != null; {
    int num;
    node = num < 0 ? node.Left : node.Right;
  }
  )
  {
    num = this.comparer.Compare(item, node.Item);
    if (num == 0)
      return node;
  }
  return (SortedSet<T>.Node) null;
}
like image 198
Raman Zhylich Avatar answered Oct 06 '22 07:10

Raman Zhylich