
Should one prefer ImmutableDictionary, or ImmutableSortedDictionary?

I have heard that the .NET System.Collections.Immutable collections are implemented as balanced binary trees in order to satisfy their immutability constraints. This supposedly applies even to collections that traditionally model hash tables, like Dictionary, which would use the integral value of GetHashCode as the sort key.

If I have a key type that is cheap to hash and cheap to compare (e.g. string or int), and I don't care whether my collection is sorted, would it make sense to prefer ImmutableSortedDictionary because the underlying data structure is sorted anyway?

Billy ONeal asked Apr 08 '15 18:04


2 Answers

Yes, it can make sense to prefer ImmutableSortedDictionary under certain conditions. In my case, with Int32 keys, ImmutableSortedDictionary turned out to be the better pick.

I ran a small benchmark with 1 million items:

  • Insert 1,000,000 items in ascending order of key
  • Update 1,000,000 random items
  • Scan 1,000,000 items, i.e. iterate once over each item in the collection
  • Read 1,000,000 random items
  • Delete 1,000,000 random items
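The benchmark code itself is not shown here. A minimal sketch of how the five phases above could be timed follows; the `Time` and `Run` helpers, the fixed random seed, and the use of the `IImmutableDictionary<int, object>` interface (which adds interface dispatch that a benchmark on the concrete types would not have) are all assumptions made for illustration, so absolute numbers will differ from the ones reported below.

```csharp
using System;
using System.Collections.Immutable;
using System.Diagnostics;

var random = new Random(42); // fixed seed (an assumption) so runs are repeatable

// Hypothetical helper: times one phase, prints the elapsed time, returns the phase's result.
static T Time<T>(string label, Func<T> phase)
{
    var sw = Stopwatch.StartNew();
    T result = phase();
    sw.Stop();
    Console.WriteLine($"{label,-8}{sw.ElapsedMilliseconds,6} ms");
    return result;
}

// Hypothetical helper: runs the five phases on n items and returns the number of items scanned.
int Run(string name, IImmutableDictionary<int, object> empty, int n)
{
    Console.WriteLine(name);
    var value = new object();

    // Insert n items one at a time, in ascending order of key.
    var dict = Time("Insert:", () =>
    {
        var d = empty;
        for (int i = 0; i < n; i++) d = d.Add(i, value);
        return d;
    });

    // Update n randomly chosen existing keys.
    dict = Time("Update:", () =>
    {
        var d = dict;
        for (int i = 0; i < n; i++) d = d.SetItem(random.Next(n), value);
        return d;
    });

    // Scan: iterate once over every item.
    int scanned = Time("Scan:", () =>
    {
        int c = 0;
        foreach (var pair in dict) c++;
        return c;
    });

    // Read n randomly chosen keys.
    Time("Read:", () =>
    {
        int hits = 0;
        for (int i = 0; i < n; i++)
            if (dict.TryGetValue(random.Next(n), out _)) hits++;
        return hits;
    });

    // Delete n randomly chosen keys; removing a key that is already gone is a no-op.
    Time("Delete:", () =>
    {
        var d = dict;
        for (int i = 0; i < n; i++) d = d.Remove(random.Next(n));
        return d;
    });

    return scanned;
}

Run("ImmutableDictionary<int, object>", ImmutableDictionary<int, object>.Empty, 1_000_000);
Run("ImmutableSortedDictionary<int, object>", ImmutableSortedDictionary<int, object>.Empty, 1_000_000);
```

A single Stopwatch pass like this is only a rough measurement. It includes JIT warm-up and garbage collection noise, so a dedicated benchmarking harness would give more trustworthy numbers.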

ImmutableDictionary<int, object>

Insert: 2499 ms
Update: 7275 ms
Scan:    385 ms
Read:    881 ms
Delete: 5037 ms

ImmutableSortedDictionary<int, object>

Insert: 1808 ms
Update: 4928 ms
Scan:    246 ms
Read:    732 ms
Delete: 3522 ms

In this run, ImmutableSortedDictionary was faster than ImmutableDictionary on every operation, taking roughly 17% (Read) to 36% (Scan) less time. Note that insertion was done one item at a time in ascending order of key, because that happens to match my particular use case.

However, you should also consider using a mutable collection protected by a lock. Writing to a mutable Dictionary<int, object> is about one order of magnitude faster.
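As a minimal sketch of that alternative, assuming a single lock object guarding every access (the `gate`, `cache`, `Set`, and `TryGet` names are hypothetical):

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var gate = new object();                   // hypothetical lock object guarding the dictionary
var cache = new Dictionary<int, object>(); // mutable, not safe for concurrent writes on its own

// Every write goes through the lock.
void Set(int key, object value)
{
    lock (gate) { cache[key] = value; }
}

// Reads take the same lock, because Dictionary does not allow reads concurrent with writes.
bool TryGet(int key, out object? value)
{
    lock (gate) { return cache.TryGetValue(key, out value); }
}

// Four workers write disjoint key ranges concurrently.
Parallel.For(0, 4, worker =>
{
    for (int i = 0; i < 1000; i++) Set(worker * 1000 + i, worker);
});

Console.WriteLine(cache.Count); // prints 4000
```

Whether this beats the immutable collections depends on how contended the lock is. Unlike an immutable dictionary, it also does not give readers a stable snapshot they can keep using while writes continue.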

ZunTzu answered Nov 02 '22 14:11


A hash-based collection should be significantly faster on .NET because:

  1. It can use a more efficient search tree specialized for int keys such as a hash trie or Patricia tree.

  2. Its inner loop will do almost entirely int comparisons rather than generic comparisons.

However, if you need better performance you will usually be much better off switching to a mutable collection such as Dictionary (or HashSet, if you only need the keys).

J D answered Nov 02 '22 15:11