Could anyone explain to me why the generic List.Contains() function is so slow?
I have a List<long> with about a million numbers, and code that constantly checks whether a specific number is among them.
I tried doing the same thing using Dictionary<long, byte>
and the Dictionary.ContainsKey()
function, and it was about 10-20 times faster than with the List.
Of course, I don't really want to use Dictionary for that purpose, because it wasn't meant to be used that way.
So, the real question here is: is there any alternative to List<T>.Contains() that is not as whacky as Dictionary<K,V>.ContainsKey()?
If you are just checking for existence, HashSet<T>
in .NET 3.5 is your best option - dictionary-like performance, but no key/value pair - just the values:
    Random rand = new Random();
    HashSet<int> data = new HashSet<int>();
    for (int i = 0; i < 1000000; i++)
    {
        data.Add(rand.Next(50000000)); // duplicates are silently ignored
    }
    bool contains = data.Contains(1234567); // etc
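Since the question already has the numbers in a List<long>, here is a minimal sketch of building the set straight from that list using the HashSet<T> constructor that accepts an IEnumerable<T>. The class and method names (MembershipIndex, FromList) are made up for illustration; only the HashSet<long> constructor and Contains come from the framework.

```csharp
using System.Collections.Generic;

public static class MembershipIndex
{
    // Hypothetical helper: copies the list into a HashSet once,
    // so later existence checks no longer scan the whole list.
    // Duplicate values in the list collapse into a single entry.
    public static HashSet<long> FromList(List<long> numbers)
    {
        return new HashSet<long>(numbers);
    }
}
```

You would build the set once, e.g. `HashSet<long> index = MembershipIndex.FromList(numbers);`, and then call `index.Contains(value)` wherever the code currently calls `numbers.Contains(value)`. Keep in mind that the set does not preserve order or duplicates, so this only fits if you really are just checking for existence.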
List.Contains is an O(n) operation: it compares the value against each element in turn until it finds a match.
Dictionary.ContainsKey is an O(1) operation on average, since it uses the key's hash code to go straight to the right bucket instead of scanning every entry.
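If you want to see the difference on your own data, a rough timing sketch could look like this. The names LookupTiming and CountHits are hypothetical; the code relies only on ICollection<T>.Contains (which both List<T> and HashSet<T> implement) and System.Diagnostics.Stopwatch. It is not a rigorous benchmark, just a quick way to compare the two.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class LookupTiming
{
    // Hypothetical helper: counts how many probe values are present in
    // the collection and reports the elapsed time in milliseconds.
    // Works for List<long> (linear scan per probe) and HashSet<long>
    // (hash lookup per probe), since both implement ICollection<long>.
    public static int CountHits(ICollection<long> haystack, long[] probes, out long elapsedMs)
    {
        Stopwatch sw = Stopwatch.StartNew();
        int hits = 0;
        foreach (long probe in probes)
        {
            if (haystack.Contains(probe))
            {
                hits++;
            }
        }
        sw.Stop();
        elapsedMs = sw.ElapsedMilliseconds;
        return hits;
    }
}
```

Call it twice with the same probes, once with the List<long> and once with a HashSet<long> built from it. The hit counts should match, and with around a million entries the elapsed time for the list should be noticeably larger.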
I don't think it's a good idea to scan through a List that contains a million entries to find a few of them.
Isn't it possible to store those million entries in an RDBMS, for instance, and run queries against that database?
If it is not possible, then I would use a Dictionary anyway if you want to do key-lookups.