Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ConcurrentDictionary: Proper small initial capacity

I keep running into a lack of guidance choosing proper initial capacities for ConcurrentDictionary<TKey, TValue>.

My general use case is those situations where you really want to do something like the following, but cannot:

public static class StaticCache<T>
{
    public static readonly Action CompiledExpression = ...;
}

This generic-based approach avoids a dictionary lookup, but can only be used if we always know the required type at compile time. If we only have a Type known at runtime, we can no longer use this approach. The next contender is a ConcurrentDictionary<TKey, TValue>.

The documentation states:

The default capacity (DEFAULT_CAPACITY), which represents the initial number of buckets, is a trade-off between the size of a very small dictionary and the number of resizes when constructing a large dictionary. Also, the capacity should not be divisible by a small prime number. The default capacity is 31.

My number of expected elements tends to be relatively small. Sometimes as small as 3 or 5, sometimes perhaps 15. As such:

  • The number of insertions over the lifetime of the application will be extremely minimal, warranting a [write] concurrency level of 1, thus optimizing for compactness and for read operations.
  • It is preferable to have the smallest possible memory footprint, to optimize cache behavior.

Since the default initial capacity is 31, we can potentially reduce our impact on the cache (as well as increase the likelihood for our dictionary to remain in the cache) by using a smaller initial capacity.

This raises the following questions:

  1. What does the capacity actually mean?

    • (A) That the dictionary does not need to grow to hold up to this many elements?
    • (B) A fixed percentage of A, depending on the dictionary's maximum "fullness", e.g. 75%?
    • (C) An approximation of A or B, depending on how the actual contents' hash codes distribute them?
  2. What does and does not constitute "a small prime"? Apparently, 31 does not. Does 11? Does 17? Does 23?

  3. If we do happen to want a capacity near a small prime, what capacity can we choose instead? Do we simply choose the nearest non-prime number, or are primes better for capacities and should we really choose a greater prime instead?

like image 419
Timo Avatar asked Jun 17 '26 22:06

Timo


1 Answers

In the reference source for ConcurrentDictionary<TKey, TValue> you can see:

Node[] buckets = new Node[capacity];

So, the capacity is the effective size of the hash table. No "fullness" is considered. The only pre-processing of this number is:

if (capacity < concurrencyLevel)
{
    capacity = concurrencyLevel;
}

where concurrencyLevel is either defined by you through a constructor parameter or is the default concurrency level defined as PlatformHelper.ProcessorCount.

The capacity is treated differently in Dictionary<TKey,TValue>. Here it is initialized with

private void Initialize(int capacity) {
    int size = HashHelpers.GetPrime(capacity);
    buckets = new int[size];
    ...
}

and HashHelpers.GetPrime gets the smallest prime which is greater than or equal to the specified capacity. Primes up to 7199369 are taken from a precalculated array. Lager ones are calculated "the hard way". It is interesting to note that the smallest considered prime is 3.

Unfortunately, HashHelpers is an internal class.

If I understand it right, both implementations resize the hash table based on the number of collisions and not based on a specific fill-factor ("fullness").

If you want to

  • optimize speed: take an initial capacity which is a prime about 30% bigger than the expected maximum dictionary size. This avoids resizing.
  • optimize the memory footprint: take a prime which is about 30% bigger than the minimum expected size.
  • a balance between speed and memory footprint: Take a number in between the two from above. But in any case, take a prime.
like image 133
Olivier Jacot-Descombes Avatar answered Jun 19 '26 13:06

Olivier Jacot-Descombes