
Why is a 1 character .NET string 32 bytes in x64?

I've been trying to figure out the overhead of a string in .NET 4 x64. This is what I've got so far.

  • 16 byte object header for x64
  • 4 bytes for the stringLength field (arrayLength is gone in .NET 4)
  • (length + 1) * 2 bytes for the string content (UTF-16, null terminated)

So you'd expect a 1 character string to be 16 + 4 + 4 = 24 bytes. It's divisible by 8 so it shouldn't need any padding.
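The arithmetic above can be sketched as a small calculation. This models only the breakdown listed in the question, not the CLR's actual layout; the function name, its parameters, and the round-up-to-8 step are assumptions made for illustration.

```python
def expected_string_size(length, header=16, length_field=4, align=8):
    """Size predicted by the breakdown above (a model, not the CLR's real layout)."""
    # UTF-16 code units plus one null terminator, 2 bytes each
    content = (length + 1) * 2
    raw = header + length_field + content
    # Round up to the next multiple of `align` (assumed to be 8 on x64)
    return (raw + align - 1) // align * align


print(expected_string_size(1))  # 16 + 4 + 4 = 24, already a multiple of 8
```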

But when I look at the sizes in WinDbg, I see them taking 32 bytes. When I run !dumpobj on them, the reported size is 28 bytes, which I assume is getting rounded up to 32. What's going on? Is there another round of memory alignment happening?

RandomEngy asked Oct 10 '11 18:10



2 Answers

I suspect that the first character is aligned on an 8-byte boundary on x64, so that when passed as a pointer to unmanaged code, it's a properly-aligned pointer. Your figures certainly fit in with the ones I got measuring string size recently, leading to formulae of:

32 bit: 14 + length * 2 bytes (rounded up to a multiple of 4)
64 bit: 26 + length * 2 bytes (rounded up to a multiple of 8)

So in a 64 bit CLR, even a 0-length string takes 32 bytes by my reckoning.
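As a rough check, the two formulae can be evaluated for a few lengths. This is a minimal sketch of the empirical figures quoted in this answer, not anything taken from the CLR itself; `round_up` and `measured_string_size` are hypothetical helper names.

```python
def round_up(n, multiple):
    """Round n up to the next multiple of `multiple`."""
    return (n + multiple - 1) // multiple * multiple


def measured_string_size(length, is_64_bit=True):
    """String size in bytes according to the empirical formulae in this answer."""
    if is_64_bit:
        return round_up(26 + length * 2, 8)
    return round_up(14 + length * 2, 4)


# Columns: length, 32-bit size, 64-bit size
for n in range(5):
    print(n, measured_string_size(n, is_64_bit=False), measured_string_size(n))
```

For a 1-character string on x64, the unrounded figure is 26 + 2 = 28 bytes, which matches the 28 reported in the question before it is rounded up to 32.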

Jon Skeet answered Oct 21 '22 18:10


Perhaps it is rounding up to paragraph (16-byte) boundaries to optimize cache line fills on Intel processors?

Brian Knoblauch answered Oct 21 '22 17:10