I have an application that holds ~1,000,000 strings in memory for performance reasons. My application consumes ~200 MB of RAM.
I want to reduce the amount of memory consumed by the strings.
I know .NET represents strings in UTF-16 encoding (2 bytes per char). Most strings in my application contain pure English characters, so storing them in UTF-8 would take roughly half the memory that UTF-16 does.
Is there a way to store a string in memory in UTF-8 encoding while still allowing standard string functions? (My needs mostly come down to IndexOf with StringComparison.OrdinalIgnoreCase.)
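For illustration, the byte counts can be compared directly; this is just a quick sketch with a made-up sample string:

```csharp
using System;
using System.Text;

class ByteCountDemo
{
    static void Main()
    {
        // Pure-ASCII text: UTF-16 needs two bytes per char, UTF-8 only one.
        string sample = "The quick brown fox jumps over the lazy dog";
        Console.WriteLine(Encoding.Unicode.GetByteCount(sample)); // 86 bytes (UTF-16)
        Console.WriteLine(Encoding.UTF8.GetByteCount(sample));    // 43 bytes (UTF-8)
    }
}
```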
Unfortunately, you can't change .NET's internal representation of strings; my guess is that the CLR is heavily optimized around its fixed two-byte-per-char layout.
What you are dealing with is the classic space-time tradeoff: to save memory you generally have to spend more processor time, and to save processor time you generally have to spend more memory.
That said, take a look at some considerations here. If I were you, once you've established that the memory saving will be large enough to matter, I'd try writing your own "string" class that stores its characters as ASCII bytes. That will probably suffice; see the sketch below.
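As a rough starting point, here is a minimal sketch of what such a class could look like, assuming your strings really are plain ASCII. The name Ascii8String and all of its members are hypothetical, and the IndexOfIgnoreCase below is an ordinal, ASCII-only approximation of IndexOf with StringComparison.OrdinalIgnoreCase, not a drop-in replacement:

```csharp
using System;
using System.Text;

public sealed class Ascii8String
{
    private readonly byte[] _bytes;   // one byte per character instead of two

    public Ascii8String(string value)
    {
        // Encoding.ASCII silently replaces non-ASCII chars with '?', so the
        // round-trip check below rejects input that would lose data.
        _bytes = Encoding.ASCII.GetBytes(value);
        if (Encoding.ASCII.GetString(_bytes) != value)
            throw new ArgumentException("Value contains non-ASCII characters.", nameof(value));
    }

    public int Length => _bytes.Length;

    // Ordinal, case-insensitive search over the ASCII bytes.
    public int IndexOfIgnoreCase(string value)
    {
        byte[] needle = Encoding.ASCII.GetBytes(value);
        for (int i = 0; i <= _bytes.Length - needle.Length; i++)
        {
            int j = 0;
            while (j < needle.Length && ToUpperAscii(_bytes[i + j]) == ToUpperAscii(needle[j]))
                j++;
            if (j == needle.Length)
                return i;
        }
        return -1;
    }

    private static byte ToUpperAscii(byte b) =>
        (byte)(b >= (byte)'a' && b <= (byte)'z' ? b - 32 : b);

    // Convert back to a regular string only when you actually need one.
    public override string ToString() => Encoding.ASCII.GetString(_bytes);
}
```

Whether the conversion cost on the way in and out is acceptable depends on how often you need a real System.String back, which is exactly the space-time tradeoff mentioned above.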
UPDATE:
More to the point, you should check the post "Of memory and strings" by Stack Overflow legend Jon Skeet, which deals with exactly the problem you are facing. Sorry I didn't mention it right away; it took me some time to find the exact post from Jon.