
Why do x64 projects use a default packing alignment of 16?

If you compile the following code in an x64 project in VS2012 without any /Zp flags:

#pragma pack(show)

then the compiler will spit out:

value of pragma pack(show) == 16

If the project targets Win32 instead, the compiler will spit out:

value of pragma pack(show) == 8

What I don't understand is that the largest natural alignment of any built-in type (i.e. long long and pointers) in Win64 is 8. So why not just make the default alignment 8 for x64?

Somewhat related to that, why would anyone ever use /Zp16?

EDIT:

Here's an example to show what I'm talking about. Even though pointers have a natural alignment of 8 bytes on x64, /Zp1 can force them onto a 1-byte boundary.

struct A
{
    char a;
    char* b;
};

// Zp16
// Offset of a == 0
// Offset of b == 8

// Zp1
// Offset of a == 0
// Offset of b == 1
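
The same effect can be reproduced for a single struct with #pragma pack instead of the /Zp switch. The following is a minimal sketch, assuming a compiler that supports #pragma pack(push/pop) (MSVC, GCC, and Clang do). The struct, constant, and function names are made up for illustration and are not part of the original question.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Default packing: b lands on its natural alignment boundary.
struct DefaultPacked {
    char  a;
    char* b;
};

// Roughly what /Zp1 does, but limited to this one struct.
#pragma pack(push, 1)
struct BytePacked {
    char  a;
    char* b;
};
#pragma pack(pop)

// Offset of b under each packing setting.
constexpr std::size_t kDefaultOffsetB = offsetof(DefaultPacked, b);
constexpr std::size_t kPackedOffsetB  = offsetof(BytePacked, b);

// Runtime check of the two layouts (hypothetical helper).
inline void check_packing_offsets() {
    // By default the pointer sits on its natural boundary (8 on x64).
    assert(kDefaultOffsetB == alignof(char*));
    // With pack(1) the pointer immediately follows the char.
    assert(kPackedOffsetB == 1);
}
```

Calling check_packing_offsets() should pass on both 32-bit and 64-bit targets, because the default offset is compared against alignof(char*) rather than a hard-coded 8.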

Now if we take an example that uses SSE:

struct A
{
    char a;
    char* b;
    __m128 c; // declared with __declspec(align(16)) in xmmintrin.h
};

// Zp16
// Offset of a == 0
// Offset of b == 8
// Offset of c == 16

// Zp1
// Offset of a == 0
// Offset of b == 1
// Offset of c == 16

If __m128 were truly a builtin type, then I'd expect the offset of c to be 9 with /Zp1. But since it uses __declspec(align(16)) in its definition in xmmintrin.h, that trumps any /Zp setting.
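
The rule described here can be written down as a small model. This is only a sketch of the behaviour observed in the offsets above, not MSVC's actual implementation: it assumes the pack value caps a member's natural alignment but does not cap an explicitly declared one. Member, align_up, and layout_offsets are hypothetical names.

```cpp
#include <algorithm>  // std::min
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one struct member (not an MSVC API).
struct Member {
    std::size_t size;            // sizeof the member
    std::size_t natural_align;   // alignment the type asks for
    bool        explicit_align;  // true if the type carries __declspec(align(N))
};

// Round offset up to the next multiple of align (align must be a power of two).
constexpr std::size_t align_up(std::size_t offset, std::size_t align) {
    return (offset + align - 1) & ~(align - 1);
}

// Compute member offsets under the assumed rule: the pack value caps
// natural alignment, but leaves an explicitly declared alignment alone.
inline std::vector<std::size_t> layout_offsets(const std::vector<Member>& members,
                                               std::size_t pack) {
    assert(pack != 0 && (pack & (pack - 1)) == 0);  // pack must be a power of two
    std::vector<std::size_t> offsets;
    std::size_t cursor = 0;
    for (const Member& m : members) {
        const std::size_t align =
            m.explicit_align ? m.natural_align : std::min(m.natural_align, pack);
        cursor = align_up(cursor, align);
        offsets.push_back(cursor);
        cursor += m.size;
    }
    return offsets;
}
```

Feeding it the three members from the struct above ({1, 1, false}, {8, 8, false}, {16, 16, true}) yields offsets 0, 8, 16 for a pack value of 16 and 0, 1, 16 for a pack value of 1. Clearing explicit_align on the last member moves it to offset 9 under a pack value of 1, which is the layout the question is asking about.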

So here's my question worded a little differently: is there a type for 'c' that has a natural alignment of 16B but will have an offset of 9 in the previous example?

lhumongous asked Apr 15 '13 14:04



2 Answers

The MSDN page here includes the following relevant information about your question "why not make the default alignment 8 for x64?":

Writing applications that use the latest processor instructions introduces some new constraints and issues. In particular, many new instructions require that data must be aligned to 16-byte boundaries. Additionally, by aligning frequently used data to the cache line size of a specific processor, you improve cache performance. For example, if you define a structure whose size is less than 32 bytes, you may want to align it to 32 bytes to ensure that objects of that structure type are efficiently cached.
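
As an illustration of the last sentence of that quote, here is a minimal sketch using standard C++11 alignas, which plays the same role as __declspec(align(32)) in MSVC-specific code. HotData, its fields, and is_aligned are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>   // std::uintptr_t

// Hypothetical 12-byte payload, over-aligned to 32 bytes.
struct alignas(32) HotData {
    float x, y, z;
};

// The size is padded up to a multiple of the alignment, so every
// element of an array of HotData starts on a 32-byte boundary.
static_assert(alignof(HotData) == 32, "declared alignment is honoured");
static_assert(sizeof(HotData) == 32, "size is rounded up to the alignment");

// True if p is a multiple of n bytes (hypothetical helper).
inline bool is_aligned(const void* p, std::size_t n) {
    return reinterpret_cast<std::uintptr_t>(p) % n == 0;
}

// Runtime check that consecutive array elements stay aligned.
inline void check_hot_data_array() {
    HotData items[3] = {};
    for (const HotData& item : items) {
        assert(is_aligned(&item, 32));
    }
}
```

The cost of this choice is the padding: each 12-byte object occupies 32 bytes.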

Roger Rowland answered Nov 05 '22 06:11


Why do x64 projects use a default packing alignment of 16?

On x64, floating-point arithmetic is performed in the SSE unit. You state that the largest alignment of any type is 8, but that is not correct: some of the SSE intrinsic types, for example __m128, have an alignment of 16.
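
To see the effect of a 16-byte-aligned member without depending on the SSE headers, the sketch below uses a stand-in type. Vec4 is a hypothetical substitute for __m128 (four floats with a declared 16-byte alignment); it is not how __m128 itself is defined.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Hypothetical stand-in for __m128: 16 bytes, 16-byte aligned.
struct alignas(16) Vec4 {
    float lanes[4];
};

// Same member order as the struct in the question.
struct WithVector {
    char  a;
    char* b;
    Vec4  c;
};

constexpr std::size_t kVec4Align   = alignof(Vec4);
constexpr std::size_t kStructAlign = alignof(WithVector);
constexpr std::size_t kOffsetC     = offsetof(WithVector, c);

// Runtime check of the layout (hypothetical helper).
inline void check_vector_layout() {
    assert(kVec4Align == 16);               // larger than any pointer or long long
    assert(kStructAlign == 16);             // the struct inherits its strictest member alignment
    assert(kOffsetC == 16);                 // c is pushed to the next 16-byte boundary
    assert(sizeof(WithVector) % 16 == 0);   // size is padded so arrays stay aligned
}
```

A type like this has an alignment requirement that a default packing value of 8 could not express, which is consistent with the point made above.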

David Heffernan answered Nov 05 '22 06:11