Why is C++11's POD "standard layout" definition the way it is?

Tags:

c++

c++11

standard-layout

I'm looking into the new, relaxed POD definition in C++11 (section 9, paragraph 7):

A standard-layout class is a class that:

has no non-static data members of type non-standard-layout class (or array of such types) or reference,

has no virtual functions (10.3) and no virtual base classes (10.1),

has the same access control (Clause 11) for all non-static data members,

has no non-standard-layout base classes,

either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and

has no base classes of the same type as the first non-static data member.

I've highlighted the bits that surprised me.

What would go wrong if we tolerated data members with varying access controls?

What would go wrong if the first data member was also a base class? i.e.

    struct Foo {};
    struct Good : Foo { int x; Foo y; };
    struct Bad  : Foo { Foo y; int x; };

I admit it's a weird construction, but why should Bad be prohibited but not Good?

Finally, what would go wrong if more than one constituent class had data members?

asked Aug 23 '11 12:08

spraff

2 Answers

One of the later paragraphs allows you to cast the address of a standard-layout class object to a pointer to its first member and back, which is also common practice in C:

    struct A { int x; };
    A a;

    // "px" is guaranteed to point to a.x
    int *px = (int*) &a;

    // guaranteed to point to a
    A *pa = (A*)px;

For that to work, the first member and the complete object have to have the same address (the compiler cannot adjust the int pointer by some offset, because it cannot know whether the int is a member of an A or not).

Finally, what would go wrong if more than one constituent class had data members?

Within a class, members are allocated at increasing addresses according to their declaration order. However, C++ doesn't dictate the order of allocation for data members across classes. If both the derived class and the base class had data members, the Standard deliberately doesn't define an order for their addresses, so as to give an implementation full flexibility in laying out memory. But for the above cast to work, you need to know which member is "first" in allocation order!

What would go wrong if the first data member was also a base class?

If the base class has the same type as the first data member, implementations that place base class subobjects before the derived class's data members in memory would need a padding byte before those data members (the base class would then have size one), to avoid having the same address for both the base class subobject and the first data member (in C++, two distinct objects of the same type always have different addresses). But that would again make it impossible to cast the address of the derived class object to a pointer to its first data member.

answered Sep 29 '22 11:09

Johannes Schaub - litb

It's basically about compatibility with C++03 and C:

same access control - C++03 implementations are allowed to use access control specifiers as an opportunity to re-order the (groups of) members of a class, for example in order to pack it better.
more than one class in the hierarchy with non-static data members - C++03 doesn't say where base classes are located, or whether padding is elided in base class subobjects that would be present in a complete object of the same type.
base class and first member of the same type - because of the second rule, if the base class type is used for a data member, then it must be an empty class. Many compilers do implement the empty base class optimization, so what Andreas says about the sub-objects having the same address would be true. I'm not sure though what it is about standard-layout classes that means it's bad for the base class subobject to have the same address as a first data member of the same type, but it doesn't matter when the base class subobject has the same address as a first data member of a different type. [Edit: it's because different objects of the same type have different addresses, even if they're empty sub-objects. Thanks to Johannes]

C++0x probably could have defined that those things are standard-layout types too, in which case it would also define how they're laid out, to the same extent it does for standard-layout types. Johannes's answer goes into this further; look at his example of a nice property of standard-layout classes that these things interfere with.

But if it did that, then some implementations would be forced to change how they lay out the classes to match the new requirements, which is a nuisance for struct compatibility between different versions of that compiler pre- and post- C++0x. It breaks the C++ ABI, basically.

My understanding of how standard layout was defined is that they looked at what POD requirements could be relaxed without breaking existing implementations. So I assume, without checking, that the above are examples where some existing C++03 implementation does use the non-POD nature of the class to do something that's incompatible with standard layout.

answered Sep 29 '22 11:09

Steve Jessop
