
Why does this struct padding trick work?

Consider this simple program

#include <iostream>

struct A
{
    int   x1234;
    short x56;
    char  x7;
};

struct B : A
{
    char x8;
};

int main()
{
    std::cout << sizeof(A) << ' ' << sizeof(B) << '\n';
    return 0;
}

This prints 8 12. Even though B could be packed into 8 bytes without breaking alignment requirements, instead it takes up a greedy 12 bytes.

It would be nice to have sizeof(B) == 8, but the answer to "Is the size of a struct required to be an exact multiple of the alignment of that struct?" suggests that there isn't a way.

I was therefore surprised when the following

struct MakePackable
{
};

struct A : MakePackable
{
    int   x1234;
    short x56;
    char  x7;
};

struct B : A
{
    char x8;
};

printed 8 8.

What is going on here? I suspect that standard-layout types have something to do with it. If so, what is the rationale for them causing the above behaviour, when the only purpose of that feature is to ensure binary compatibility with C?


EDIT: As others have pointed out this is ABI or compiler specific, so I ought to add that this behaviour was observed on x86_64-unknown-linux-gnu with the following compilers:

  • clang 3.6
  • gcc 5.1

I have also noticed something strange from clang's struct dumper. If we ask for the data size without tail padding ("dsize"), we get:

                  dsize(A)   dsize(B)
first example        8          9
second example       7          8

In the first example dsize(A) == 8. Why is this not 7?
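To see where x8 actually lands, here is a minimal sketch that measures member offsets at run time. The namespaces first and second and the helpers offset_in, x7_offset and x8_offset are names made up for this illustration; they are not part of the original programs. Given the sizes and dsize values reported above, x8 would be expected at offset 8 in the first layout and at offset 7 in the second on those compilers, but other ABIs may differ.

```cpp
#include <cstddef>

namespace first {
struct A { int x1234; short x56; char x7; };
struct B : A { char x8; };
}

namespace second {
struct MakePackable {};
struct A : MakePackable { int x1234; short x56; char x7; };
struct B : A { char x8; };
}

// Byte offset of `member` inside `obj`, measured on a live object.
// (offsetof is only conditionally supported for non-standard-layout
// types such as B, so the addresses are compared directly instead.)
template <class Object, class Member>
std::size_t offset_in(const Object& obj, const Member& member)
{
    return static_cast<std::size_t>(
        reinterpret_cast<const char*>(&member) -
        reinterpret_cast<const char*>(&obj));
}

// Offset of the last member of the base class A within a Derived object.
template <class Derived>
std::size_t x7_offset()
{
    Derived b{};
    return offset_in(b, b.x7);
}

// Offset of the member added by the derived class B.
template <class Derived>
std::size_t x8_offset()
{
    Derived b{};
    return offset_in(b, b.x8);
}
```

Calling x8_offset<first::B>() and x8_offset<second::B>() from main and printing the results next to sizeof shows whether the derived member was placed inside A's tail padding or after it.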

asked Jul 07 '15 00:07 by PBS


2 Answers

This is a data point although not a complete answer.

Say we have (as a complete translation unit, not a snippet):

struct X {};

struct A
{
    int   x1234;
    short x56;
    char  x7;
};

void func(A &dst, A const &src) 
{
    dst = src;
}

With g++, this function is compiled to:

movq    (%rdx), %rax
movq    %rax, (%rcx)

However, if struct A : X is used instead, the function compiles to:

movl    (%rdx), %eax
movl    %eax, (%rcx)
movzwl  4(%rdx), %eax
movw    %ax, 4(%rcx)
movzbl  6(%rdx), %eax
movb    %al, 6(%rcx)

These two cases actually correspond to the sizes being 8 12 and 8 8 respectively in OP's example.

The reason for this is fairly clear: A might be used as a base for some class B, and then the call func(b, a); must be careful not to disturb other members of b that might reside in the padding area (b.x8 in OP's example).

I cannot see any particular property of A : X in the C++ standard which would make g++ decide that the padding is re-usable in struct A : X, but not in struct A. Both A and A : X are trivially copyable, standard layout and POD.

I guess it must just be an optimization decision based on typical usage. The version without re-use will be faster to copy. Maybe a g++ ABI designer could comment?

Interestingly, this example shows that being trivially copyable does not imply that memcpy(&b, &a, sizeof b) is equivalent to b = a!
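The difference can be sketched as follows. The helper names copy_by_assignment, copy_by_memcpy and x8_after are hypothetical, introduced only for this illustration. Assignment through an A& must leave b.x8 alone, whereas copying all sizeof(A) bytes also writes A's tail padding. Strictly speaking, the standard only blesses memcpy for objects that are not base-class subobjects, so treat the memcpy result as ABI-specific behaviour rather than something guaranteed.

```cpp
#include <cstring>

struct X {};
struct A : X { int x1234; short x56; char x7; };
struct B : A { char x8; };

// Copy with the assignment operator, as func() above does.
void copy_by_assignment(A& dst, const A& src) { dst = src; }

// Copy every byte of the A representation, tail padding included.
void copy_by_memcpy(A& dst, const A& src) { std::memcpy(&dst, &src, sizeof(A)); }

// Overwrites the A part of a B object using `copy` and returns b.x8 afterwards.
template <class Copy>
char x8_after(Copy copy)
{
    A a;
    std::memset(&a, 0, sizeof a);  // every byte of a, padding included, is now zero

    B b;
    b.x1234 = 1;
    b.x56 = 2;
    b.x7 = 3;
    b.x8 = 'Z';

    copy(b, a);                    // b binds to A&, so only the A subobject is targeted
    return b.x8;
}
```

x8_after(copy_by_assignment) returns 'Z' everywhere. On a layout where sizeof(B) == 8, the byte at offset 7 holds x8, so x8_after(copy_by_memcpy) would be expected to return 0 instead, which is why the compiler cannot use a single 8-byte move for dst = src in that case.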

answered Oct 23 '22 19:10 by M.M


I'm not a real C++ language lawyer, but here is what I've found so far:

According to the answers in this question, a struct remains a standard-layout POD only while at most one class among itself and its parent classes has non-static data members. By that rule, A has a guaranteed layout in both cases, but B does not in either case.

Supporting this is the fact that std::is_pod is true for A and false for B in both cases.

  • First case: http://ideone.com/jyPb5J
  • Second case: http://ideone.com/bYcLXa
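The same check can be reproduced locally with a small sketch like the one below. The namespaces first and second and the helper is_pod_like are names invented for this illustration. Because std::is_pod is deprecated as of C++20, the sketch spells the POD test out as "standard layout and trivial", which is how the standard defines a POD class.

```cpp
#include <type_traits>

namespace first {
struct A { int x1234; short x56; char x7; };
struct B : A { char x8; };
}

namespace second {
struct MakePackable {};
struct A : MakePackable { int x1234; short x56; char x7; };
struct B : A { char x8; };
}

// POD == standard layout + trivial (std::is_pod itself is deprecated in C++20).
template <class T>
constexpr bool is_pod_like()
{
    return std::is_standard_layout<T>::value && std::is_trivial<T>::value;
}

// A keeps all of its data members in one class, so it is standard layout
// with or without the empty base.
static_assert(std::is_standard_layout<first::A>::value, "first::A");
static_assert(std::is_standard_layout<second::A>::value, "second::A");

// B adds a data member on top of A's members, which breaks standard layout
// in both cases.
static_assert(!std::is_standard_layout<first::B>::value, "first::B");
static_assert(!std::is_standard_layout<second::B>::value, "second::B");

// Every one of these types is still trivially copyable.
static_assert(std::is_trivially_copyable<first::B>::value, "first::B");
static_assert(std::is_trivially_copyable<second::B>::value, "second::B");
```

Note that these traits give identical answers for both variants of A, so on their own they do not distinguish the 8 12 case from the 8 8 case.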

So if I'm understanding this correctly myself, the compiler is allowed some room to do what it wants with the layout of B in both cases. And apparently in the second case it feels like making use of what would otherwise have been the padding byte of A.

answered Oct 23 '22 18:10 by TheUndeadFish