I'm doing some benchmark on several string dictionaries, because I wanted to find out how performs the different configurations, and I got surprised that the normal Dictionary<String,String>
is the faster. Probably because I'm missing some concept or I'm doing something wrong, but these are the results I got:
Collection: 2147482 items.
Random Keys: 1000 keys.
Normal dictionary
Add 1000 items: 573ms.
Get 1000 keys: 0ms.
Normal Dictionary with OrdinalIgnoreCase comparer
Add 1000 items: 642ms.
Get 1000 keys: 0ms.
Normal Dictionary with InvariantCultureIgnoreCase comparer
Add 1000 items: 1661ms.
Get 1000 keys: 0ms.
Sorted dictionary
Add 1000 items: 11996ms.
Get 1000 keys: 5ms.
Sorted dictionary with OrdinalIgnoreCase comparer
Add 1000 items: 11097ms.
Get 1000 keys: 5ms.
Sorted dictionary with InvariantCultureIgnoreCase comparer
Add 1000 items: 9814ms.
Get 1000 keys: 5ms.
An this is the code I'm using to test it:
static void Main(string[] args)
{
String[] col = GenerateCollectionUnsorted(Int32.MaxValue / 1000).ToArray();
Int32 len = col.Length;
Console.WriteLine("Collection:\t\t" + len.ToString() + " items.");
String[] randomKeys = GetRandomKeys(col, 1000);
Console.WriteLine("Random Keys:\t\t" + randomKeys.Length.ToString() + " keys.");
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine();
Console.WriteLine("Normal dictionary");
TestDic(col, randomKeys, new Dictionary<String, String>());
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine();
Console.WriteLine("Normal Dictionary with OrdinalIgnoreCase comparer");
TestDic(col, randomKeys, new Dictionary<String, String>(StringComparer.OrdinalIgnoreCase));
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine();
Console.WriteLine("Normal Dictionary with InvariantCultureIgnoreCase comparer");
TestDic(col, randomKeys, new Dictionary<String, String>(StringComparer.InvariantCultureIgnoreCase));
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine();
Console.WriteLine("Sorted dictionary");
TestDic(col,randomKeys, new SortedDictionary<String,String>());
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine();
Console.WriteLine("Sorted dictionary with OrdinalIgnoreCase comparer");
TestDic(col, randomKeys, new SortedDictionary<String, String>(StringComparer.OrdinalIgnoreCase));
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine();
Console.WriteLine("Sorted dictionary with InvariantCultureIgnoreCase comparer");
TestDic(col, randomKeys, new SortedDictionary<String, String>(StringComparer.InvariantCultureIgnoreCase));
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine("\nEnd");
Console.ReadKey(true);
}
static void TestDic(String[] col, String[] randomKeys, IDictionary<String,String> dic)
{
Stopwatch crono = new Stopwatch();
crono.Start();
foreach (String s in col)
dic.Add(s, s);
crono.Stop();
Console.WriteLine("\tAdd "+randomKeys.Length.ToString()+" items:\t\t" + crono.ElapsedMilliseconds.ToString() + "ms.");
crono.Reset();
crono.Start();
String sv;
foreach (var rk in randomKeys)
{
sv = dic[rk];
sv.ToString();
}
crono.Stop();
Console.WriteLine("\tGet " + randomKeys.Length.ToString() + " keys:\t\t" + crono.ElapsedMilliseconds.ToString() + "ms.");
}
static String[] GetRandomKeys(String[] col, Int32 count)
{
Random ran = new Random(DateTime.Now.Millisecond);
List<Int32> indexSelection = new List<int>();
List<String> selection = new List<string>();
Int32 len = col.Length;
while (indexSelection.Count < count)
{
Int32 value = ran.Next(0, len - 1);
if (!indexSelection.Contains(value))
{
indexSelection.Add(value);
selection.Add(col[value]);
}
}
return selection.ToArray();
}
static IEnumerable<String> GenerateCollection(Int32 count)
{
for (int i = 0; i < count; i++)
{
yield return i.ToString("X").PadLeft(5, '0');
}
}
static IEnumerable<String> GenerateCollectionUnsorted(Int32 amount)
{
for (int i = 0; i < amount / 2; i++)
{
yield return i.ToString("X").PadLeft(5, '0');
yield return (amount-i).ToString("X").PadLeft(5, '0');
}
}
Any idea why is this happening?
EDIT 1: It was my understanding that sorted dictionary would be slower adding items, and faster getting them because the collection is sorted. And also that comparing strings with a OrdinalIgnoreCase or InvariantCultureIgnoreCase should be faster that a normal comparision, so the seek should be faster. But maybe my understanding is completely wrong :D
Thanks in advance.
EDIT 2: As curiosity, I've made some test on String comparisons:
Collection: 2147482 items.
Random Keys: 1000 keys.
CurrentCulture
LookUp: 158209ms.
CurrentCultureIgnoreCase
LookUp: 160710ms.
InvariantCulture
LookUp: 132765ms.
InvariantCultureIgnoreCase
LookUp: 133409ms.
Ordinal
LookUp: 36115ms.
OrdinalIgnoreCase
LookUp: 36329ms.
Because only the normal dictionary with the default comparer is actually a normal dictionary, all the others are sorted dictionaries.
So, the result is very consistent.
With that fixed, the result is different. :)
A sorted dictionary is slower when adding items as they need to be sorted, but it's not faster when getting items. Searching the binary tree is fast, but searching the hash list that a regular dictionary uses is faster. When the binary tree grows, there will be more steps to locate each item, while a dictionary mostly grows by adding more buckets so the number of compares is barely affected.
Comparing strings with Ordinal
is faster than regular (CurrentCulture
) compare, and OrdinalIgnoreCase
is faster than CurrentCultureIgnoreCase
, but it's not certain that OrdinalIgnoreCase
is faster than CurrentCulture
. InvariantCulture
compare is not faster than a regular compare, it just uses a different culture. The reason that ordinal compare is a lot faster is that it doesn't have to bother with cultural settings at all.
By the way, I noticed an error in the GetRandomKeys method. It will never pick the last item, as you are getting a random number between 0 and Length - 2.
From the MSDN documentation on SortedDictionary
:
The
SortedDictionary<TKey, TValue>
generic class is a binary search tree with O(log n) retrieval, where n is the number of elements in the dictionary.
From the docs on Dictionary
:
Retrieving a value by using its key is very fast, close to O(1), because the
Dictionary<TKey, TValue>
class is implemented as a hash table.
Therefore it should not be surprising that Dictionary
can outperform SortedDictionary
in many cases.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With