I've been trying to figure out the overhead of a string in .NET 4 x64. This is what I've got so far.
So you'd expect a 1 character string to be 16 + 4 + 4 = 24 bytes. It's divisible by 8 so it shouldn't need any padding.
But when I look at the sizes in WinDbg I see them taking 32 bytes. When I !dumpobject them they say their size is 28 bytes, which is what I assume is getting rounded up to 32. What's going on? Is there another round of memory alignment happening?
So this string is 32 bytes because that's the way it was built in this implementation; it will be 16 in other implementations and 64 in yet another. The size of the string will (like water) depend on the environment it is used in.
A byte is eight bits, a word is 2 bytes (16 bits), a doubleword is 4 bytes (32 bits), and a quadword is 8 bytes (64 bits).
So a string size is 18 + (2 * number of characters) bytes. (In reality, another 2 bytes are sometimes used for padding to ensure 32-bit alignment, but I'll ignore that.) 2 bytes are needed for each character, since .NET strings are UTF-16.
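To make that arithmetic concrete, here is a small Python sketch of the formula above. It is only arithmetic, not a measurement of the CLR. The function name `estimated_string_size` and the `pad_to_4` flag are made up for illustration, and the padding rule assumes the extra bytes simply round the total up to the next multiple of 4.

```python
def estimated_string_size(num_chars, pad_to_4=False):
    """Estimate from the formula above: 18 + 2 * number of characters.

    pad_to_4 is a hypothetical switch that rounds the result up to the
    next multiple of 4, modelling the 32-bit alignment padding.
    """
    size = 18 + 2 * num_chars
    if pad_to_4:
        # Round up to the next multiple of 4 (assumed padding rule).
        size = (size + 3) // 4 * 4
    return size


# Print the unpadded and padded estimates for a few lengths.
for n in (0, 1, 2, 10):
    print(n, estimated_string_size(n), estimated_string_size(n, pad_to_4=True))
```

With padding switched on, the estimate only ever grows by 0 or 2 bytes, which is why the answer treats it as negligible.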
I suspect that the first character is aligned on an 8-byte boundary on x64, so that when passed as a pointer to unmanaged code, it's a properly-aligned pointer. Your figures certainly fit in with the ones I got measuring string size recently, leading to formulae of:
32 bit: 14 + length * 2 (rounded up to 4 bytes)
64 bit: 26 + length * 2 (rounded up to 8 bytes)
So in a 64 bit CLR, even a 0-length string takes 32 bytes by my reckoning.
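If you want to play with those two formulae, here is a rough Python sketch. It just encodes the numbers quoted above (14 or 26 bytes of overhead, rounded up to 4 or 8 bytes); it does not inspect a real CLR heap, and the helper names `align_up` and `clr_string_size` are made up for this example.

```python
def align_up(size, boundary):
    """Round size up to the next multiple of boundary."""
    return (size + boundary - 1) // boundary * boundary


def clr_string_size(length, bits=64):
    """Estimated string size in bytes, per the measured formulae above.

    32 bit: 14 + length * 2, rounded up to 4 bytes
    64 bit: 26 + length * 2, rounded up to 8 bytes
    """
    if bits == 64:
        overhead, boundary = 26, 8
    elif bits == 32:
        overhead, boundary = 14, 4
    else:
        raise ValueError("bits must be 32 or 64")
    return align_up(overhead + 2 * length, boundary)


# Show the estimate for a few lengths on both platforms.
for n in (0, 1, 3, 4):
    print(n, clr_string_size(n, bits=32), clr_string_size(n, bits=64))
```

For a 1-character string on x64 this gives 26 + 2 = 28 bytes before rounding and 32 after, which lines up with the 28 and 32 reported in the question.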
Rounding up to paragraph (16-byte) boundaries to optimize cache line fills on Intel processors?