Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

LINQ into SortedList

I'm a complete LINQ newbie, so I don't know if my LINQ is incorrect for what I need to do or if my expectations of performance are too high.

I've got a SortedList of objects, keyed by int; SortedList as opposed to SortedDictionary because I'll be populating the collection with pre-sorted data. My task is to find either the exact key or, if there is no exact key, the one with the next higher value. If the search is too high for the list (e.g. highest key is 100, but search for 105), return null.

// The structure of this class is unimportant.  Just using
// it as an illustration.
public class CX
{
    public int KEY;
    public DateTime DT;
}

static CX getItem(int i, SortedList<int, CX> list)
{
    var items =
    (from kv in list
     where kv.Key >= i
     select kv.Key);

    if (items.Any())
    {
        return list[items.Min()];
    }

    return null;
}

Given a list of 50,000 records, calling getItem 500 times takes about a second and a half. Calling it 50,000 times takes over 2 minutes. This performance seems very poor. Is my LINQ bad? Am I expecting too much? Should I be rolling my own binary search function?

like image 216
Chris Simmons Avatar asked May 24 '10 14:05

Chris Simmons


3 Answers

First, your query is being evaluated twice (once for Any, and once for Min). Second, Min requires that it iterate over the entire list, even though the fact that it's sorted means that the first item will be the minimum. You should be able to change this:

if (items.Any())
{
    return list[items.Min()];
}

To this:

var default = 
    (from kv in list
     where kv.Key >= i
     select (int?)kv.Key).FirstOrDefault();

if(default != null) return list[default.Value];

return null;

UPDATE

Because you're selecting a value type, FirstOrDefault doesn't return a nullable object. I have altered your query to cast the selected value to an int? instead, allowing the resulting value to be checked for null. I would advocate this over using ContainsKey, as that would return true if your list contained a value for 0. For example, say you have the following values

0 2 4 6 8

If you were to pass in anything less than or equal to 8, then you would get the correct value. However, if you were to pass in 9, you would get 0 (default(int)), which is in the list but isn't a valid result.

like image 99
Adam Robinson Avatar answered Nov 09 '22 02:11

Adam Robinson


Writing a binary search on your own can be tough.

Fortunately, Microsoft already wrote a pretty robust one: Array.BinarySearch<T>. This is, in fact, the method that SortedList<TKey, TValue>.IndexOfKey uses internally. Only problem is, it takes a T[] argument, instead of any IList<T> (like SortedList<TKey, TValue>.Keys).

You know what, though? There's this great tool called Reflector that lets you look at .NET source code...

Check it out: a generic BinarySearch extension method on IList<T>, taken straight from the reflected code of Microsoft's Array.BinarySearch<T> implementation.

public static int BinarySearch<T>(this IList<T> list, int index, int length, T value, IComparer<T> comparer) {
    if (list == null)
        throw new ArgumentNullException("list");
    else if (index < 0 || length < 0)
        throw new ArgumentOutOfRangeException((index < 0) ? "index" : "length");
    else if (list.Count - index < length)
        throw new ArgumentException();

    int lower = index;
    int upper = (index + length) - 1;

    while (lower <= upper) {
        int adjustedIndex = lower + ((upper - lower) >> 1);
        int comparison = comparer.Compare(list[adjustedIndex], value);
        if (comparison == 0)
            return adjustedIndex;
        else if (comparison < 0)
            lower = adjustedIndex + 1;
        else
            upper = adjustedIndex - 1;
    }

    return ~lower;
}

public static int BinarySearch<T>(this IList<T> list, T value, IComparer<T> comparer) {
    return list.BinarySearch(0, list.Count, value, comparer);
}

public static int BinarySearch<T>(this IList<T> list, T value) where T : IComparable<T> {
    return list.BinarySearch(value, Comparer<T>.Default);
}

This will let you call list.Keys.BinarySearch and get the negative bitwise complement of the index you want in case the desired key isn't found (the below is taken basically straight from tzaman's answer):

int index = list.Keys.BinarySearch(i);
if (index < 0)
    index = ~index;
var item = index < list.Count ? list[list.Keys[index]] : null;
return item;
like image 44
Dan Tao Avatar answered Nov 09 '22 03:11

Dan Tao


Using LINQ on a SortedList will not give you the benefit of the sort.

For optimal performance, you should write your own binary search.

like image 3
SLaks Avatar answered Nov 09 '22 03:11

SLaks