I added a nested dictionary of the following format to my code:
Dictionary<string, Dictionary<string, Dictionary<string, float>>>
After doing so I noticed the memory usage of my application shot up significantly. These dictionaries are keyed on strings that are often repeated, and there are many of these dictionaries, on the order of tens of thousands.
To address this problem I hypothesized that the repeated strings were eating up a significant amount of memory. My solution was to hash the strings and use an integer key instead (keeping a single reverse-lookup table so I could map a hash back to its original string when necessary):
Dictionary<int, Dictionary<int, Dictionary<int, float>>>
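Roughly, the change I had in mind looks like this (a minimal sketch, not my actual code; the key strings and the Key helper are made up for illustration, and a real version would have to handle hash collisions, since String.GetHashCode is not collision-free):

using System;
using System.Collections.Generic;

class HashedKeys
{
    // One shared reverse-lookup table so a hash can be mapped back to its string.
    static readonly Dictionary<int, string> Reverse = new Dictionary<int, string>();

    static int Key(string s)
    {
        int h = s.GetHashCode();
        Reverse[h] = s; // remember how to reverse the hash
        return h;
    }

    static void Main()
    {
        var data = new Dictionary<int, Dictionary<int, Dictionary<int, float>>>();
        data[Key("deviceA")] = new Dictionary<int, Dictionary<int, float>>
        {
            [Key("sensor1")] = new Dictionary<int, float> { [Key("reading")] = 42.0f }
        };

        Console.WriteLine(Reverse[Key("deviceA")]); // prints "deviceA"
        Console.WriteLine(data[Key("deviceA")][Key("sensor1")][Key("reading")]); // prints 42
    }
}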
So I turned to a memory profiler to see what kind of size reduction I could get. To my shock, I found that the string-keyed storage was actually smaller (in both normal and inclusive size).
This doesn't make intuitive sense to me. Even if the runtime were smart enough to store only one copy of each string and use a reference to it, I would think that reference would be a pointer, which is double the size of an int on 64-bit. I also never called String.Intern, so I don't know how that would have been accomplished (and is String.Intern even the right method here?).
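For what it's worth, here's the kind of quick check I had in mind (a minimal sketch; the example strings are arbitrary). My understanding is that the runtime interns string literals automatically, while strings constructed at runtime are separate objects unless String.Intern is called on them:

using System;

class InternDemo
{
    static void Main()
    {
        string a = "sensor"; // literal: interned by the runtime
        string b = "sensor"; // same literal: same interned reference
        string c = new string(new[] { 's', 'e', 'n', 's', 'o', 'r' }); // built at runtime: a new object

        Console.WriteLine(ReferenceEquals(a, b));                // True
        Console.WriteLine(ReferenceEquals(a, c));                // False
        Console.WriteLine(ReferenceEquals(a, string.Intern(c))); // True: Intern returns the pooled copy
    }
}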
I'm very confused as to what's happening under the hood; any help would be appreciated.
If your keys and values are objects, there's approximately 20 bytes of overhead for each element of a dictionary, plus several more bytes per dictionary; this is in addition to the space consumed by the keys and values themselves. If your keys and values are value types, it's 12 bytes plus the space consumed by the key and value for each item. That assumes the number of elements equals the internal dictionary capacity, but typically there is more capacity than elements, so there is wasted space.
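If you want to see what that overhead comes to on your runtime, a rough way to measure it is to allocate a large number of small dictionaries and compare the GC heap size before and after (a sketch under simple assumptions; the shape of the dictionaries here is just an example, and the result includes the array that holds them):

using System;
using System.Collections.Generic;

class OverheadDemo
{
    static void Main()
    {
        const int count = 100000;
        var keep = new Dictionary<int, float>[count]; // keeps the dictionaries reachable

        long before = GC.GetTotalMemory(forceFullCollection: true);
        for (int i = 0; i < count; i++)
        {
            var d = new Dictionary<int, float>(8); // a small dictionary, like the nested ones
            for (int j = 0; j < 8; j++) d[j] = j;
            keep[i] = d;
        }
        long after = GC.GetTotalMemory(forceFullCollection: true);

        Console.WriteLine($"~{(after - before) / (double)count:F0} bytes per 8-element dictionary");
        GC.KeepAlive(keep);
    }
}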
The wasted space will generally be a higher relative percentage if you have lots of dictionaries with a small number of elements than if you had one dictionary with many elements. If I go by your comment, your dictionaries with 8 elements will have a capacity of 11, those with 2 elements will have a capacity of 3, and those with 10 will have a capacity of 11.
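On runtimes that expose EnsureCapacity (.NET Core 2.1 and later), you can check the actual capacities yourself: calling EnsureCapacity(0) reports the current internal capacity without growing it (a sketch; the exact primes can vary between runtime versions):

using System;
using System.Collections.Generic;

class CapacityDemo
{
    static void Main()
    {
        foreach (int n in new[] { 2, 8, 10 })
        {
            var d = new Dictionary<int, float>(n); // sized for n elements up front
            Console.WriteLine($"{n} requested -> capacity {d.EnsureCapacity(0)}");
        }
        // The capacities come from a prime table (3, 7, 11, 17, ...),
        // so 2 -> 3, 8 -> 11, and 10 -> 11 on the runtimes described above.
    }
}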
If I understand your nesting counts, a single top-level dictionary will represent 184 dictionary elements. But if we count unused capacity, it's closer to 200 as far as space consumption goes. 200 * 20 = 4,000 bytes for each top-level dictionary. How many of those do you have? You say tens of thousands of them across thousands of objects. Every 10,000 of them will consume about 38 MB of dictionary overhead alone. Add to that the keys and values stored in the dictionaries.
A possible explanation of why your attempt to make it smaller by managing the hash codes yourself didn't work: if there are not a lot of duplicated references to your keys, there is little string storage to save in the first place. Replacing an object reference key with an int key doesn't change the per-entry dictionary overhead, and you're adding the storage of your new collection of hash codes on top.
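To make that concrete (my own illustration, with made-up key names): if the same string instance is reused as a key everywhere, each string-keyed entry stores only an 8-byte reference on 64-bit, so switching to a 4-byte int key saves very little per entry, and the reverse-lookup table is pure additional cost. Note too that String.GetHashCode is not collision-free, so using it as a key would also need collision handling:

using System;
using System.Collections.Generic;

class KeySizeDemo
{
    static void Main()
    {
        string shared = "temperature"; // one string object, referenced everywhere it is used

        // String-keyed: the entry stores a reference to 'shared', not a copy of it.
        var byString = new Dictionary<string, float> { [shared] = 1.0f };

        // Int-keyed: a smaller key per entry, but a side table is now required
        // to get the original string back.
        int h = shared.GetHashCode();
        var byHash = new Dictionary<int, float> { [h] = 1.0f };
        var reverse = new Dictionary<int, string> { [h] = shared };

        Console.WriteLine(byString[shared]); // 1
        Console.WriteLine(byHash[h]);        // 1
        Console.WriteLine(reverse[h]);       // temperature
    }
}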