Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Performance Benchmarking of Contains, Exists and Any

I have been searching for a performance benchmarking between Contains, Exists and Any methods available in the List<T>. I wanted to find this out just out of curiosity as I was always confused among these. Many questions on SO described definitions of these methods such as:

  1. LINQ Ring: Any() vs Contains() for Huge Collections
  2. Linq .Any VS .Exists - Whats the difference?
  3. LINQ extension methods - Any() vs. Where() vs. Exists()

So I decided to do it myself. I am adding it as an answer. Any more insight on the results is most welcomed. I also did this benchmarking for arrays to see the results

like image 833
harshit Avatar asked Sep 06 '13 07:09

harshit


People also ask

Where exists vs any?

Answers. Basically both does same job. Exists method exists since long time. Any is a recent method.

What is any () in C#?

The Any method checks whether any of the element in a sequence satisfy a specific condition or not. If any element satisfy the condition, true is returned.

Is array faster than list C#?

In general, one would opt for using Lists (List) due to their flexibility in size. On top of that, msdn documentation claims Lists use an array internally and should perform just as fast (a quick look with Reflector confirms this).


1 Answers

The fastest way is to use a HashSet. The Contains for a HashSet is O(1).

I took your code and added a benchmark for HashSet<int> The performance cost of HashSet<int> set = new HashSet<int>(list); is nearly zero.

Code

void Main() {     ContainsExistsAnyVeryShort();          ContainsExistsAnyShort();      ContainsExistsAny(); }  private static void ContainsExistsAny() {     Console.WriteLine("***************************************");     Console.WriteLine("********* ContainsExistsAny ***********");     Console.WriteLine("***************************************");      List<int> list = new List<int>(6000000);     Random random = new Random();     for (int i = 0; i < 6000000; i++)     {         list.Add(random.Next(6000000));     }     int[] arr = list.ToArray();     HashSet<int> set = new HashSet<int>(list);      find(list, arr, set, (method, stopwatch) => $"{method}: {stopwatch.ElapsedMilliseconds}ms");  }  private static void ContainsExistsAnyShort() {     Console.WriteLine("***************************************");     Console.WriteLine("***** ContainsExistsAnyShortRange *****");     Console.WriteLine("***************************************");      List<int> list = new List<int>(2000);     Random random = new Random();     for (int i = 0; i < 2000; i++)     {         list.Add(random.Next(6000000));     }     int[] arr = list.ToArray();     HashSet<int> set = new HashSet<int>(list);      find(list, arr, set, (method, stopwatch) => $"{method}: {stopwatch.ElapsedMilliseconds}ms");  }  private static void ContainsExistsAnyVeryShort() {     Console.WriteLine("*******************************************");     Console.WriteLine("***** ContainsExistsAnyVeryShortRange *****");     Console.WriteLine("*******************************************");      List<int> list = new List<int>(10);     Random random = new Random();     for (int i = 0; i < 10; i++)     {         list.Add(random.Next(6000000));     }     int[] arr = list.ToArray();     HashSet<int> set = new HashSet<int>(list);      find(list, arr, set, (method, stopwatch) => $"{method}: {stopwatch.ElapsedTicks} ticks");  }  private static void find(List<int> list, int[] arr, HashSet<int> set, Func<string, Stopwatch, string> format) {     Random random = new Random();     int[] find = new int[10000];     for (int i = 0; i < 10000; i++)     {         find[i] = random.Next(6000000);     }      Stopwatch watch = Stopwatch.StartNew();     for (int rpt = 0; rpt < 10000; rpt++)     {         list.Contains(find[rpt]);     }     watch.Stop();     Console.WriteLine(format("List/Contains", watch));      watch = Stopwatch.StartNew();     for (int rpt = 0; rpt < 10000; rpt++)     {         list.Exists(a => a == find[rpt]);     }     watch.Stop();     Console.WriteLine(format("List/Exists", watch));      watch = Stopwatch.StartNew();     for (int rpt = 0; rpt < 10000; rpt++)     {         list.Any(a => a == find[rpt]);     }     watch.Stop();     Console.WriteLine(format("List/Any", watch));      watch = Stopwatch.StartNew();     for (int rpt = 0; rpt < 10000; rpt++)     {         arr.Contains(find[rpt]);     }     watch.Stop();     Console.WriteLine(format("Array/Contains", watch));      watch = Stopwatch.StartNew();     for (int rpt = 0; rpt < 10000; rpt++)     {         Array.Exists(arr, a => a == find[rpt]);     }     watch.Stop();     Console.WriteLine(format("Array/Exists", watch));      watch = Stopwatch.StartNew();     for (int rpt = 0; rpt < 10000; rpt++)     {         Array.IndexOf(arr, find[rpt]);     }     watch.Stop();     Console.WriteLine(format("Array/IndexOf", watch));      watch = Stopwatch.StartNew();     for (int rpt = 0; rpt < 10000; rpt++)     {         arr.Any(a => a == find[rpt]);     }     watch.Stop();     Console.WriteLine(format("Array/Any", watch));      watch = Stopwatch.StartNew();     for (int rpt = 0; rpt < 10000; rpt++)     {         set.Contains(find[rpt]);     }     watch.Stop();     Console.WriteLine(format("HashSet/Contains", watch)); } 

RESULTS

******************************************* ***** ContainsExistsAnyVeryShortRange ***** ******************************************* List/Contains: 1067 ticks List/Exists: 2884 ticks List/Any: 10520 ticks Array/Contains: 1880 ticks Array/Exists: 5526 ticks Array/IndexOf: 601 ticks Array/Any: 13295 ticks HashSet/Contains: 6629 ticks *************************************** ***** ContainsExistsAnyShortRange ***** *************************************** List/Contains: 4ms List/Exists: 28ms List/Any: 138ms Array/Contains: 6ms Array/Exists: 34ms Array/IndexOf: 3ms Array/Any: 96ms HashSet/Contains: 0ms *************************************** ********* ContainsExistsAny *********** *************************************** List/Contains: 11504ms List/Exists: 57084ms List/Any: 257659ms Array/Contains: 11643ms Array/Exists: 52477ms Array/IndexOf: 11741ms Array/Any: 194184ms HashSet/Contains: 3ms 

Edit (2021-08-25)

I added a comparison for very short collections (10 items) and also added Array.Contains and Array.IndexOf. You can see that Array.IndexOf is the fastest for such small ranges. That is, because as @lucky-brian said n is so small here, that a for-loop performs better than a somewhat complex search algorithm. However I still advice to use HashSet<T> whenever possible, as it better reflects the intend of only having unique values and the difference for small collections is below 1ms.

like image 142
wertzui Avatar answered Sep 24 '22 03:09

wertzui