I have a method that uses recursion to traverse a tree and update the items.
Currently the method takes pretty long to process all the items, so i started optimizing things. Among those things is the use of a dictionary instead of executing a DB query for each item.
The dictionary is defined as
System.Collections.Generic.Dictionary<EffectivePermissionKey, MyData>
The key type is defined as
private struct EffectivePermissionKey
{
// http://blog.martindoms.com/2011/01/03/c-tip-override-equals-on-value-types-for-better-performance/
public override bool Equals(object aObject)
{
if (aObject == null)
return false;
else
return aObject is EffectivePermissionKey && Equals((EffectivePermissionKey)aObject);
}
public bool Equals(EffectivePermissionKey aObject)
{
return this.ID == aObject.ID && this.OrchardUserID == aObject.OrchardUserID;
}
public override int GetHashCode()
{
// http://stackoverflow.com/a/32502294/3936440
return unchecked(ID.GetHashCode() * 23 * 23 + OrchardUserID.GetHashCode() * 23);
}
public int ID;
public int OrchardUserID;
}
When the method runs it takes around 5000 recursions to update all items.
Initially it took around 100 seconds without the dictionary.
A first approach with DB queries replaced by the use of a dictionary with int
keys took 22 seconds.
Now, with DB queries replaced by the use of the dictionary defined above and proper TryGetValue()
calls it takes 97 seconds <- WAT.
What is going on here? What could cause this massive performance drop?
Edit
At first, it seemed like a hash collision issue to me, so i added a breakpoint in EffectivePermissionKey.Equals()
to verify that this method is called but it's not called, therefore no hash collision i guess.
Edit2
Now i'm confused. I thought Equals()
only gets called when the hash code does not match. After printing out the hash codes of my keys and the keys used in TryGetValue()
i see that these codes match. Then i looked at the source code of Dictionary<>
and there's a line in FindEntry()
that looks like this:
if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) return i;
This means that for each item key in the dictionary the GetHashCode()
and Equals()
gets called because i process all items in the dictionary as the items are the result of the DB query whereas these results where processed before the dictionary approach anyway.
Nevermind folks, sorry for taking your time, my approach was completely wrong. Let me tell why.
The issue broken down for simplicity:
A -> recursion 1, DB query for permission of node A with ID = 1
B -> recursion 2, DB query for permission of node B with ID = 2
C -> recursion 3, DB query for permission of node C with ID = 3
D -> recursion 4, DB query for permission of node D with ID = 4
As you can see, one DB query per tree node.
Now the flawed approach for optimizing this:
Dictionary<int, PermissionData> myMap
...
DB query of all permissions and insert into myMap
...
A -> recursion 1, myMap.TryGetValue(1, out ...)
B -> recursion 2, myMap.TryGetValue(2, out ...)
C -> recursion 3, myMap.TryGetValue(3, out ...)
D -> recursion 4, myMap.TryGetValue(4, out ...)
You see now that the query is done once but on each node aTryGetValue()
call is made.
In my specific case this is actually slower as executing single queries because
and
each TryGetValue()
requires / results in
TryGetValue()
Equals()
These 4 steps are executed around 5000 times versus executing 5000 simple entity framework queries (SELECT * FROM table WHERE ID = ...
). I don't know why, but the queries are faster here, perhaps the compiler optimizes something away.
Anyway, i reworked the whole thing and now i have an outer loop over User IDs and an inner recursive traversal witch uses dictionaries with a simple int key (node ID). It gives me lighting fast results. The whole execution now takes around 16 seconds and with a few more tweaks and threading i got it down to under 1 second. Mission accomplished.
Edit
After discussing this issue with a colleague we came to the conclusion that the performance issue is most likely caused by the prime numbers used in the hash code calculation. I used 23 x 23 x 23 but it should be something like 17 x 23 x 23 to avoid collisions but i cannot test this as the concerning code / application is not in my responsibility anymore. Btw a generic solution can be found here: https://stackoverflow.com/a/763966/3936440
Edit 2
As noted by a colleague the following answer suggests to not use 17 and 23, instead use larger prime numbers, see https://stackoverflow.com/a/38281271/3936440
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With