
When extending a padded struct, why can't extra fields be placed in the tail padding?

Consider the following structs:

    struct S1 {
        int a;
        char b;
    };

    struct S2 {
        struct S1 s;       /* struct needed to make this compile as C without typedef */
        char c;
    };

    // For the C++ fans
    struct S3 : S1 {
        char c;
    };

The size of S1 is 8, which is expected due to alignment. But the size of S2 and S3 is 12, which means the compiler lays them out as:

    | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11|
    |       a       | b |  padding  | c |  padding  |
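The diagram can be checked directly. This is a minimal C++11 sketch, assuming a platform where int is 4 bytes with 4-byte alignment (the usual case, but implementation-defined). The S2Layout type and the s2_layout() helper are names made up for this illustration.

```cpp
#include <cstddef>   // std::size_t, offsetof

struct S1 { int a; char b; };
struct S2 { S1 s; char c; };

// Hypothetical helper type: where the compiler actually placed things in S2.
struct S2Layout {
    std::size_t offset_of_s;  // start of the embedded S1
    std::size_t offset_of_c;  // start of the extra char
    std::size_t size;         // sizeof(S2), including tail padding
};

// Collects the layout facts as compile-time constants.
constexpr S2Layout s2_layout() {
    return S2Layout{offsetof(S2, s), offsetof(S2, c), sizeof(S2)};
}
```

With GCC or Clang on x86-64 Linux, s2_layout() reports c at offset 8 and a total size of 12, matching the diagram.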

The compiler could place c in the padding at bytes 5 to 7 without breaking alignment constraints. What rule prevents it, and what is the reason behind it?

deadalnix asked Jun 08 '14 20:06





1 Answer

Short answer (for the C++ part of the question): The Itanium ABI for C++ prohibits, for historical reasons, using the tail padding of a base subobject of POD type. Note that C++11 does not have such a prohibition. The relevant rule 3.9/2 that allows trivially-copyable types to be copied via their underlying representation explicitly excludes base subobjects.


Long answer: I will try and treat C++11 and C at once.

  1. The layout of S1 must include padding, since S1::a must be aligned for int, and an array S1[N] consists of contiguously allocated objects of type S1, each of whose a member must be so aligned.
  2. In C++, objects of a trivially-copyable type T that are not base subobjects can be treated as arrays of sizeof(T) bytes (i.e. you can cast an object pointer to an unsigned char * and treat the result as a pointer to the first element of an unsigned char[sizeof(T)], and the value of this array determines the object). Since all objects in C are of this kind, this explains S2 for C and C++.
  3. The interesting cases remaining for C++ are:
    1. base subobjects, which are not subject to the above rule (cf. C++11 3.9/2), and
    2. any object that is not of trivially-copyable type.
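Points 1 and 2 can be sketched in code. This is a minimal C++11 illustration, not a proof. The helper names overwrite_member() and c_survives_overwrite() are invented here, and the last static_assert states what mainstream compilers do rather than quoting the standard.

```cpp
#include <cstddef>   // offsetof
#include <cstring>   // std::memcpy

struct S1 { int a; char b; };
struct S2 { S1 s; char c; };

// Point 1: sizeof(S1) is the stride of S1[N], so it has to be a multiple of
// the alignment, or the `a` of the second element would be misaligned.
static_assert(sizeof(S1) % alignof(S1) == 0,
              "array stride keeps every element aligned");
static_assert(alignof(S1) >= alignof(int),
              "S1 is aligned at least as strictly as its int member");

// Point 2: x.s is a complete S1 object, not a base subobject, so code may
// overwrite all sizeof(S1) bytes of it, padding included.
inline void overwrite_member(S2& x, const S1& value) {
    std::memcpy(&x.s, &value, sizeof(S1));
}

// That byte-wise copy is only harmless to x.c if c lives outside those bytes.
static_assert(offsetof(S2, c) >= sizeof(S1),
              "c is not stored in the tail padding of s");

// Hypothetical check: returns true when c is untouched by the copy.
inline bool c_survives_overwrite() {
    S2 x = {{1, 'b'}, 'c'};
    const S1 y = {2, 'y'};
    overwrite_member(x, y);
    return x.s.a == 2 && x.s.b == 'y' && x.c == 'c';
}
```

If c were packed into the tail padding of s, the memcpy in overwrite_member() would be free to clobber it, which is why c_survives_overwrite() can only return true with the 12-byte layout.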

For 3.1, there are indeed common, popular "base layout optimizations" in which compilers "compress" the data members of a class into the base subobjects. This is most striking when the base class is empty (∞% size reduction!), but applies more generally. However, the Itanium ABI for C++ mentioned above, which many compilers implement, forbids such tail padding compression when the respective base type is POD (in C++11 terms, POD means trivial and standard-layout).
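The empty-base case is the easiest one to observe. This is a minimal sketch in which Empty, AsMember, AsBase and empty_base_costs_nothing() are illustrative names. The snippet only reports what the compiler chose to do and does not force the optimization.

```cpp
struct Empty {};

// As a member, the empty object needs its own address, so it costs at least
// one byte plus whatever padding the int then requires.
struct AsMember {
    Empty e;
    int   x;
};

// As a base, the empty subobject may share its address with the first member.
struct AsBase : Empty {
    int x;
};

static_assert(sizeof(Empty) >= 1, "even an empty class has nonzero size");
static_assert(sizeof(AsMember) > sizeof(int),
              "a member of empty type still occupies storage");

// True when the compiler applied the empty base optimization.
constexpr bool empty_base_costs_nothing() {
    return sizeof(AsBase) == sizeof(int);
}
```

Mainstream compilers are expected to make empty_base_costs_nothing() return true, while AsMember stays larger than a lone int.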

For 3.2 the same part of the Itanium ABI applies, though I don't currently believe that the C++11 standard actually mandates that arbitrary, non-trivially-copyable member objects must have the same size as a complete object of the same type.


Previous answer kept for reference.

I believe this is because S1 is standard-layout, and so for some reason the S1-subobject of S3 remains untouched. I'm not sure if that's mandated by the standard.

However, if we turn S1 into non-standard layout, we observe a layout optimization:

    struct EB { };

    struct S1 : EB {   // not standard-layout
        EB eb;
        int a;
        char b;
    };

    struct S3 : S1 {
        char c;
    };

Now sizeof(S1) == sizeof(S3) == 12 on my platform. Live demo.

And here is a simpler example:

    struct S1 {
    private:
        int a;
    public:
        char b;
    };

    struct S3 : S1 {
        char c;
    };

The mixed access makes S1 non-standard-layout. (Now sizeof(S1) == sizeof(S3) == 8.)

Update: The defining factor seems to be triviality as well as standard-layoutness, i.e. the class must be POD. The following non-POD standard-layout class is base-layout optimizable:

    struct S1 {
        ~S1(){}
        int a;
        char b;
    };

    struct S3 : S1 {
        char c;
    };

Again sizeof(S1) == sizeof(S3) == 8. Demo
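The variants above can be compared side by side with one template. This is a sketch that assumes an Itanium-ABI compiler. Extended, reuses_tail_padding() and kAssumeItaniumAbi are names invented for this illustration, and the preprocessor check is only a rough heuristic for "GCC or Clang outside Windows", not an official ABI test.

```cpp
// Rough heuristic (an assumption): GCC and Clang on non-Windows targets
// follow the Itanium C++ ABI.
#if defined(__GNUC__) && !defined(_MSC_VER) && !defined(_WIN32)
constexpr bool kAssumeItaniumAbi = true;
#else
constexpr bool kAssumeItaniumAbi = false;
#endif

struct PodBase { int a; char b; };               // POD

struct EB {};
struct EbBase : EB { EB eb; int a; char b; };    // not standard-layout

struct MixedBase {                               // not standard-layout
    int get_a() const { return a; }              // only keeps `a` "used"
private:
    int a;
public:
    char b;
};

struct DtorBase { ~DtorBase() {} int a; char b; };  // standard-layout, not trivial

// Same extension as S3 in the text: one extra char after the base.
template <class Base>
struct Extended : Base { char c; };

// If adding `c` does not grow the object, `c` must sit in Base's tail padding.
template <class Base>
constexpr bool reuses_tail_padding() {
    return sizeof(Extended<Base>) == sizeof(Base);
}

static_assert(!kAssumeItaniumAbi || !reuses_tail_padding<PodBase>(),
              "POD base: tail padding is left alone");
static_assert(!kAssumeItaniumAbi || reuses_tail_padding<EbBase>(),
              "non-standard-layout base: tail padding is reused");
static_assert(!kAssumeItaniumAbi || reuses_tail_padding<MixedBase>(),
              "mixed-access base: tail padding is reused");
static_assert(!kAssumeItaniumAbi || reuses_tail_padding<DtorBase>(),
              "non-trivial base: tail padding is reused");
```

On other ABIs the assertions are disabled rather than adapted, since the layout rules there are different.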

Kerrek SB answered Sep 22 '22 05:09
