It's clear that a search performance of the generic <code>HashSet<T></code> class is higher than of the generic <code>List<T></code> class. Just compare the hash-based key with the linear approach in the <code>List<T></code> class. However calculating a hash key may itself take some CPU cycles, so for a small amount of items the linear search can be a real alternative to the <code>HashSet<T></code>. My question: where is the break-even? To simplify the scenario (and to be fair) let's assume that the <code>List<T></code> class uses the element's <code>Equals()</code> method to identify an item.

It's essentially pointless to compare two structures for performance that behave differently. Use the structure that conveys the intent. Even if you say your <code>List<T></code> wouldn't have duplicates and iteration order doesn't matter making it comparable to a <code>HashSet<T></code>, its still a poor choice to use <code>List<T></code> because its relatively less fault tolerant. That said, I will inspect some other aspects of performance, <pre class="prettyprint"><code>+------------+--------+-------------+-----------+----------+----------+-----------+ | Collection | Random | Containment | Insertion | Addition | Removal | Memory | | | access | | | | | | +------------+--------+-------------+-----------+----------+----------+-----------+ | List<T> | O(1) | O(n) | O(n) | O(1)* | O(n) | Lesser | | HashSet<T> | O(n) | O(1) | n/a | O(1) | O(1) | Greater** | +------------+--------+-------------+-----------+----------+----------+-----------+ </code></pre> <ul> <li>Even though addition is O(1) in both cases, it will be relatively slower in HashSet since it involves cost of precomputing hash code before storing it.</li> <li>The superior scalability of HashSet has a memory cost. Every entry is stored as a new object along with its hash code. This article might give you an idea.</li> </ul>

HashSet vs. List performance

2 Answers

A lot of people are saying that once you get to the size where speed is actually a concern that HashSet<T> will always beat List<T>, but that depends on what you are doing.

Let's say you have a List<T> that will only ever have on average 5 items in it. Over a large number of cycles, if a single item is added or removed each cycle, you may well be better off using a List<T>.

I did a test for this on my machine, and, well, it has to be very very small to get an advantage from List<T>. For a list of short strings, the advantage went away after size 5, for objects after size 20.

1 item LIST strs time: 617ms 1 item HASHSET strs time: 1332ms  2 item LIST strs time: 781ms 2 item HASHSET strs time: 1354ms  3 item LIST strs time: 950ms 3 item HASHSET strs time: 1405ms  4 item LIST strs time: 1126ms 4 item HASHSET strs time: 1441ms  5 item LIST strs time: 1370ms 5 item HASHSET strs time: 1452ms  6 item LIST strs time: 1481ms 6 item HASHSET strs time: 1418ms  7 item LIST strs time: 1581ms 7 item HASHSET strs time: 1464ms  8 item LIST strs time: 1726ms 8 item HASHSET strs time: 1398ms  9 item LIST strs time: 1901ms 9 item HASHSET strs time: 1433ms  1 item LIST objs time: 614ms 1 item HASHSET objs time: 1993ms  4 item LIST objs time: 837ms 4 item HASHSET objs time: 1914ms  7 item LIST objs time: 1070ms 7 item HASHSET objs time: 1900ms  10 item LIST objs time: 1267ms 10 item HASHSET objs time: 1904ms  13 item LIST objs time: 1494ms 13 item HASHSET objs time: 1893ms  16 item LIST objs time: 1695ms 16 item HASHSET objs time: 1879ms  19 item LIST objs time: 1902ms 19 item HASHSET objs time: 1950ms  22 item LIST objs time: 2136ms 22 item HASHSET objs time: 1893ms  25 item LIST objs time: 2357ms 25 item HASHSET objs time: 1826ms  28 item LIST objs time: 2555ms 28 item HASHSET objs time: 1865ms  31 item LIST objs time: 2755ms 31 item HASHSET objs time: 1963ms  34 item LIST objs time: 3025ms 34 item HASHSET objs time: 1874ms  37 item LIST objs time: 3195ms 37 item HASHSET objs time: 1958ms  40 item LIST objs time: 3401ms 40 item HASHSET objs time: 1855ms  43 item LIST objs time: 3618ms 43 item HASHSET objs time: 1869ms  46 item LIST objs time: 3883ms 46 item HASHSET objs time: 2046ms  49 item LIST objs time: 4218ms 49 item HASHSET objs time: 1873ms

Here is that data displayed as a graph:

enter image description here

Here's the code:

static void Main(string[] args) {     int times = 10000000;       for (int listSize = 1; listSize < 10; listSize++)     {         List<string> list = new List<string>();         HashSet<string> hashset = new HashSet<string>();          for (int i = 0; i < listSize; i++)         {             list.Add("string" + i.ToString());             hashset.Add("string" + i.ToString());         }          Stopwatch timer = new Stopwatch();         timer.Start();         for (int i = 0; i < times; i++)         {             list.Remove("string0");             list.Add("string0");         }         timer.Stop();         Console.WriteLine(listSize.ToString() + " item LIST strs time: " + timer.ElapsedMilliseconds.ToString() + "ms");           timer = new Stopwatch();         timer.Start();         for (int i = 0; i < times; i++)         {             hashset.Remove("string0");             hashset.Add("string0");         }         timer.Stop();         Console.WriteLine(listSize.ToString() + " item HASHSET strs time: " + timer.ElapsedMilliseconds.ToString() + "ms");         Console.WriteLine();     }       for (int listSize = 1; listSize < 50; listSize+=3)     {         List<object> list = new List<object>();         HashSet<object> hashset = new HashSet<object>();          for (int i = 0; i < listSize; i++)         {             list.Add(new object());             hashset.Add(new object());         }          object objToAddRem = list[0];          Stopwatch timer = new Stopwatch();         timer.Start();         for (int i = 0; i < times; i++)         {             list.Remove(objToAddRem);             list.Add(objToAddRem);         }         timer.Stop();         Console.WriteLine(listSize.ToString() + " item LIST objs time: " + timer.ElapsedMilliseconds.ToString() + "ms");            timer = new Stopwatch();         timer.Start();         for (int i = 0; i < times; i++)         {             hashset.Remove(objToAddRem);             hashset.Add(objToAddRem);         }         timer.Stop();         Console.WriteLine(listSize.ToString() + " item HASHSET objs time: " + timer.ElapsedMilliseconds.ToString() + "ms");         Console.WriteLine();     }      Console.ReadLine(); }

answered Sep 22 '22 16:09

innominate227

It's essentially pointless to compare two structures for performance that behave differently. Use the structure that conveys the intent. Even if you say your List<T> wouldn't have duplicates and iteration order doesn't matter making it comparable to a HashSet<T>, its still a poor choice to use List<T> because its relatively less fault tolerant.

That said, I will inspect some other aspects of performance,

+------------+--------+-------------+-----------+----------+----------+-----------+ | Collection | Random | Containment | Insertion | Addition |  Removal | Memory    | |            | access |             |           |          |          |           | +------------+--------+-------------+-----------+----------+----------+-----------+ | List<T>    | O(1)   | O(n)        | O(n)      | O(1)*    | O(n)     | Lesser    | | HashSet<T> | O(n)   | O(1)        | n/a       | O(1)     | O(1)     | Greater** | +------------+--------+-------------+-----------+----------+----------+-----------+

Even though addition is O(1) in both cases, it will be relatively slower in HashSet since it involves cost of precomputing hash code before storing it.
The superior scalability of HashSet has a memory cost. Every entry is stored as a new object along with its hash code. This article might give you an idea.

answered Sep 24 '22 16:09

nawfal

Related questions
                            
                                Storing WPF Image Resources
                            
                                Best way to read a large file into a byte array in C#?
                            
                                What is the difference between IQueryable<T> and IEnumerable<T>?
                            
                                How is Math.Pow() implemented in .NET Framework?
                            
                                Better way to check if a Path is a File or a Directory?
                            
                                Passing arguments to C# generic new() of templated type
                            
                                Where Is Machine.Config?
                            
                                Why does Math.Round(2.5) return 2 instead of 3?
                            
                                A generic list of anonymous class
                            
                                .NET / C# - Convert char[] to string
                            
                                How do I truncate a .NET string?
                            
                                If strings are immutable in .NET, then why does Substring take O(n) time?
                            
                                Replacing .NET WebBrowser control with a better browser, like Chrome? [closed]
                            
                                ArrayList vs List<> in C#
                            
                                What is the syntax for an inner join in LINQ to SQL?
                            
                                Difference between Math.Floor() and Math.Truncate()
                            
                                Asynchronously wait for Task<T> to complete with timeout
                            
                                Deserialize JSON object into dynamic object using Json.net
                            
                                What is the difference between Nullable<T>.HasValue or Nullable<T> != null?
                            
                                Using async/await for multiple tasks

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

HashSet vs. List performance

Tags:

performance

.net

list

collections

hash

Michael Damatov

People also ask

2 Answers

innominate227

nawfal

Recent Activity

Donate For Us