Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In C#, is there a kind of a SortedList<double> that allows fast querying (with LINQ) for the nearest value?

I am looking for a structure that holds a sorted set of double values. I want to query this set to find the closest value to a specified reference value.

I have looked at the SortedList<double, double>, and it does quite well for me. However, since I do not need explicit key/value pairs. this seems to be overkill to me, and i wonder if i could do faster.

Conditions:

  • The structure is initialised only once, and does never change (no insert/deletes)
  • The amount of values is in the range of 100k.
  • The structure is queried often with new references, which must execute fast.
  • For simplicity and speed, the set's value just below of the reference may be returned, not actually the nearest value
  • I want to use LINQ for the query, if possible, for simplicity of code.
  • I want to use no 3rd party code if possible. .NET 3.5 is available.
  • Speed is more importand than memory footprint

I currently use the following code, where SortedValues is the aforementioned SortedList

    IEnumerable<double> nearest = from item in SortedValues.Keys
                                  where item <= suggestion
                                  select item;
    return nearest.ElementAt(nearest.Count() - 1);

Can I do faster?

Also I am not 100% percent sure, if this code is really safe. IEnumerable, the return type of my query is not by definition sorted anymore. However, a Unit test with a large test data base has shown that it is in practice, so this works for me. Have you hints regarding this aspect?

P.S. I know that there are many similar questions, but none actually answers my specific needs. Especially there is this one C# Data Structure Like Dictionary But Without A Value, but the questioner does just want to check the existence not find anything.

like image 494
Marcel Avatar asked Jan 14 '10 08:01

Marcel


1 Answers

The way you are doing it is incredibly slow as it must search from the beginning of the list each time giving O(n) performance.

A better way is to put the elements into a List and then sort the list. You say you don't need to change the contents once initialized, so sorting once is enough.

Then you can use List<T>.BinarySearch to find elements or to find the insertion point of an element if it doesn't already exist in the list.

From the docs:

Return Value

The zero-based index of item in the sorted List<T>, if item is found; otherwise, a negative number that is the bitwise complement of the index of the next element that is larger than item or, if there is no larger element, the bitwise complement of Count.

Once you have the insertion point, you need to check the elements on either side to see which is closest.

like image 91
Mark Byers Avatar answered Oct 12 '22 16:10

Mark Byers