I have an IDictionary<TKey,TValue>
implementation that internally holds n other Dictionary<TKey, TValue>
and distributes that insertions by the HashCode of the key to the invidual sub-dictionaries. With 16 sub-dictionaries, the number of collisions is pretty low on a 4-core machine.
For parallel insertions, i locked the Add-method with a ReaderWriterLockSlim
, locking only the individual sub-dictionary:
public void Add(TKey key, TValue value)
{
int poolIndex = GetPoolIndex(key);
this.locks[poolIndex].EnterWriteLock();
try
{
this.pools[poolIndex].Add(key, value);
}
finally
{
this.locks[poolIndex].ExitWriteLock();
}
}
When inserting items with four threads, i only got about 32% cpu usage and bad performance. So i replaced the ReaderWriterLockSlim by a Monitor (i.e., the lock
keyword).
CPU usage was now at nearly 100% and the performance was more than doubled.
My question is: Why did the CPU usage increase? The number of collisions should not have changed. What makes ReaderWriterLock.EnterWriteLock wait so many times?
For write-only load the Monitor is cheaper than ReaderWriterLockSlim, however, if you simulate read + write load where read is much greater than write, then ReaderWriterLockSlim should out perform Monitor.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With