Member subobjects of zero size. Why not?

Tags: c++

This is the third question today in the series on zero-size objects and subobjects. The standard clearly implies that member subobjects cannot have zero size, whereas base class subobjects can.

struct X {}; //empty class, complete objects of class X have nonzero size
struct Y:X { char c; }; //Y's size may be 1 
struct Z {X x; char c;}; //Z's size MUST be greater than 1

Why not allow zero-size member-subobjects JUST LIKE zero-size base-class-subobjects?
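For anyone who wants to check the sizes from the snippet rather than take them on faith, here is a minimal sketch. The helper names `member_overhead` and `check_member_layout` are made up for illustration, and the `sizeof(Y)` assertion reflects what mainstream compilers do rather than an explicit size guarantee.

```cpp
#include <cassert>
#include <cstddef>

struct X {};                  // empty class
struct Y : X { char c; };     // X as an empty base
struct Z { X x; char c; };    // X as a member

// Guaranteed: a complete X has nonzero size, and Z stores x and c separately.
static_assert(sizeof(X) >= 1, "complete objects have nonzero size");
static_assert(sizeof(Z) > sizeof(char), "the X member takes up storage");

// Not spelled out as a size guarantee, but what mainstream compilers produce.
static_assert(sizeof(Y) == sizeof(char), "empty base optimization applied");

// Hypothetical helper: bytes the member version costs over the base version.
constexpr std::size_t member_overhead() { return sizeof(Z) - sizeof(Y); }

// Runtime check that the member x and the char c sit at different addresses.
inline void check_member_layout() {
    Z z{};
    assert(static_cast<void*>(&z.x) != static_cast<void*>(&z.c));
}
```

Calling `check_member_layout()` from `main` runs the runtime part; the `static_assert` lines are checked when the file compiles.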

TIA

EDIT after Konrad's answer: consider the following example:

struct X{}; 
struct D1:X{};
struct D2:D1, X {}; //D2 has 2 distinct subobjects of type X, can they both be 0 size and located at the same address?

If two base class subobjects of the same type X, as in my example, can be located at the same address, then member subobjects should be able to as well. If they cannot, then the compiler already treats this case specially, so it could equally treat Konrad's example (see the answer below) specially and disallow zero-size member subobjects only when there are several of the same type in the same class. Where am I wrong?
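A small sketch for testing the second possibility, with one change to the example: in `D2 : D1, X` the direct `X` base is ambiguous and cannot be named, so the sketch uses a hypothetical second intermediate class `E1` to keep both `X` subobjects reachable. The names `E1`, `Both` and `x_bases_are_distinct` are assumptions made for illustration.

```cpp
#include <cassert>

struct X {};
struct D1 : X {};
struct E1 : X {};           // hypothetical second path to an X base
struct Both : D1, E1 {};    // two distinct X base subobjects, both reachable

// Returns true if the two X base subobjects of one Both object
// live at different addresses.
inline bool x_bases_are_distinct() {
    Both b;
    X* via_d1 = static_cast<D1*>(&b);   // the X inside D1
    X* via_e1 = static_cast<E1*>(&b);   // the X inside E1
    return via_d1 != via_e1;
}

inline void check_bases() {
    assert(x_bases_are_distinct());
}
```

The standard does not allow two subobjects of the same class type within one most derived object to share an address, so the check should pass on a conforming compiler. That suggests the base-class case is indeed handled specially when the same type appears twice.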

Armen Tsirunyan asked Oct 12 '10 11:10


3 Answers

Why not allow zero-size member-subobjects

Because then several (sub)objects of the same type could have the same address and this is forbidden by the standard:

struct X { virtual ~X() { /* Just so we can use typeid! */ } };
struct Y {
    X a;
    X b;
};

Y y;
// The standard requires that the following holds:
assert(typeid(y.a) != typeid(y.b) or &y.a != &y.b);

This is somewhat logical: otherwise, these two objects would be the same for all intents and purposes (since an object’s identity is solely determined by its type and its memory address) and declaring them separately would make no sense.
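The same requirement can be checked without `typeid`, since both members have the same static type. A minimal sketch, with the helper names `members_are_distinct` and `check_members` invented for illustration:

```cpp
#include <cassert>

struct X {};                 // empty, no virtual functions needed here
struct Y {
    X a;
    X b;
};

// Two members of the same type each need their own storage,
// so Y is at least two X objects wide.
static_assert(sizeof(Y) >= 2 * sizeof(X), "a and b do not overlap");

// Returns true if the two members have different addresses.
inline bool members_are_distinct() {
    Y y;
    return &y.a != &y.b;
}

inline void check_members() {
    assert(members_are_distinct());
}
```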

Konrad Rudolph answered Oct 02 '22 00:10


I think your question is good - in fact, some of the earlier articles about the empty base class optimization talk about the "empty member optimization" and specifically say that a similar optimization could apply to members. Maybe, before there was a standard, some compilers did this.

This is just idle speculation and I don't have much to back it up, but I did look at some of these areas of the standard yesterday.

C compatibility

In this example:

struct X{};
struct Y{};
struct Z {
   struct X x;
   struct Y y;
   int i;
};

Z would be a POD under C++03 rules, but would not be layout-compatible with C if x and y were zero-sized subobjects. C layout compatibility is one of the reasons for the existence of PODs. This problem can't happen with base classes because C doesn't have base classes, and classes with base classes aren't PODs in C++03, so all bets are off :).

Visitor has noted that C doesn't, in fact, support empty structs, so this entire argument is wrong. I'd just assumed that it did; it seemed like a harmless generalization.

Further, programs seem to rely on things like y having a greater address than x. For members of the same object with no access specifier between them, this ordering is guaranteed by the relational operators on pointers (5.9/2 in C++03). I don't really know if there's a compelling reason to allow this, or how many programs use it in practice.

IMO, this is the strongest argument of all of these.
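To make the ordering assumption concrete, here is a sketch of what such code might rely on. It reuses the `Z` from the example; the function names are hypothetical. With zero-sized members, `x` and `y` would have equal offsets and the strict comparisons below would fail.

```cpp
#include <cassert>
#include <cstddef>

struct X {};
struct Y {};
struct Z {
    X x;
    Y y;
    int i;
};

// Later-declared members sit at strictly greater offsets.
static_assert(offsetof(Z, x) < offsetof(Z, y), "x precedes y");
static_assert(offsetof(Z, y) < offsetof(Z, i), "y precedes i");

// The same check at run time, comparing byte addresses within one object.
inline bool declaration_order_matches_address_order() {
    Z z{};
    const char* px = reinterpret_cast<const char*>(&z.x);
    const char* py = reinterpret_cast<const char*>(&z.y);
    const char* pi = reinterpret_cast<const char*>(&z.i);
    return px < py && py < pi;
}

inline void check_order() {
    assert(declaration_order_matches_address_order());
}
```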

Doesn't generalize well to arrays

Continuing the above example, add this:

struct Z1 {
   struct X x[1];
   struct Y y[1];
   int i;
};

...one might expect that sizeof(Z1) == sizeof(Z), and that x and y also work as normal arrays do (i.e. that you can form a past-the-end pointer, and that pointer is different to any element's address). One of these expectations would be broken with zero-sized subobjects.
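The array expectations can be written down directly. This is only a sketch, and `past_the_end_differs` and `check_array` are names made up for the example:

```cpp
#include <cassert>
#include <cstddef>

struct X {};

// Arrays have no padding between elements, so the size scales with the count.
static_assert(sizeof(X[4]) == 4 * sizeof(X), "array size is count * element size");

// The usual element-count idiom divides by sizeof(X), so it needs a nonzero size.
static_assert(sizeof(X[4]) / sizeof(X) == 4, "element count idiom works");

// Returns true if the past-the-end pointer of a one-element array
// differs from the address of its only element.
inline bool past_the_end_differs() {
    X x[1];
    X* first = &x[0];
    X* end = x + 1;
    return end != first && (end - first) == 1;
}

inline void check_array() {
    assert(past_the_end_differs());
}
```

If `X` had zero size as an array element, `first` and `end` would coincide and the division in the count idiom would be a division by zero.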

Less compelling than the base case

One of the main reasons for deriving from an empty base is that it is a policy or interface-type class. These are often empty, and requiring them to take up space would impose an "abstraction penalty", i.e. make better organized code more bloated. This is something that Stroustrup doesn't want in C++: he wants appropriate abstraction for minimal runtime cost.

On the other hand, if you declare a member of a type, you don't inherit its functions, typedefs, etc; and you don't get the special pointer conversions from derived to base, so perhaps there's less reason to have a zero-sized member than a zero-sized base.

A counterexample here is something like the Allocator policy class in STL containers - you don't necessarily want your containers to derive from it, but you want to "keep it around" without it taking up overhead.

Empty base class case covers most uses

...you can use private inheritance instead of declaring a member if you're worried about the space overhead. It's not quite as direct, but you can more or less achieve the same thing. Obviously this doesn't work so well if you've got lots of empty members that you would like to take up zero space.
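A rough sketch of the private-inheritance workaround, using a made-up stateless policy `EmptyPolicy` and two hypothetical holders. The `sizeof(HoldsBase)` assertion reflects what mainstream compilers do, not something every conceivable implementation must do.

```cpp
#include <cassert>

struct EmptyPolicy {                        // hypothetical stateless policy
    int scale(int v) const { return 2 * v; }
};

struct HoldsMember {                        // policy kept as a member
    EmptyPolicy policy;
    int value;
    explicit HoldsMember(int v) : policy(), value(v) {}
    int scaled() const { return policy.scale(value); }
};

struct HoldsBase : private EmptyPolicy {    // policy kept as a private base
    int value;
    explicit HoldsBase(int v) : value(v) {}
    int scaled() const { return scale(value); }
};

static_assert(sizeof(HoldsMember) > sizeof(int), "member policy costs space");
static_assert(sizeof(HoldsBase) == sizeof(int), "empty base costs nothing here");

inline void check_policy_holders() {
    assert(HoldsMember(21).scaled() == 42);
    assert(HoldsBase(21).scaled() == 42);
}
```

Both holders behave the same; only the layout differs. The base version stops working if the policy type is `final` or not a class at all, which is part of why it is less direct than a member.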

It's another special case

There are quite a few subtle things that don't work with this optimization. For example, you can't memcpy the bits of a zero-sized POD subobject into a char array, and then back, or between zero-sized subobjects. I've seen people implement operator= using memcpy (I don't know why...) which would break this sort of thing. Presumably it's less of a problem to break such things for base classes instead of members.
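Here is a sketch of the kind of byte copy that works today precisely because the member has nonzero size. The function name `copy_empty_member` is an assumption for illustration.

```cpp
#include <cassert>
#include <cstring>

struct X {};
struct Z {
    X x;
    char c;
};

// Copies the bytes of one empty member into another and returns the
// neighbouring char of the destination, which should be untouched.
inline char copy_empty_member() {
    Z src{};
    Z dst{};
    src.c = 'a';
    dst.c = 'b';
    std::memcpy(&dst.x, &src.x, sizeof(X));   // copies sizeof(X) >= 1 bytes
    return dst.c;
}

inline void check_memcpy() {
    assert(copy_empty_member() == 'b');
}
```

Since `x` owns its own storage, the copy stays inside it. If `x` were zero-sized and shared its address with `c`, a copy of `sizeof(X)` bytes would land on `c` instead, which is the breakage the paragraph above is worried about.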

Doug answered Oct 01 '22 22:10


The compiler could conditionally allow zero-size member objects as well as base classes, sure, but it would be more complex. The empty base class optimization always applies, regardless of type. Any time the compiler sees a class derive from a class with no data members, it can use the empty base class optimization.

Following Konrad Rudolph's example through, with member objects, it'd have to check the type, verify that no other object of the same type exists at that location, and then maybe apply your optimization. Well, unless the member object is located at the end of the containing class. If so, then the object's "real" (non-zero) size would protrude past the end of the containing class, which would also be an error. That can never happen in the base class case because we know that the base class is located at the beginning of the derived class, and the derived class has non-zero size.
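A sketch of the base-class placement described here, plus the one situation I'm aware of where the base-class case needs the same type check: an empty base whose type matches the first member. The struct `W` and the function names are made up for the example, and the first check reflects typical compiler behaviour for this simple layout.

```cpp
#include <cassert>

struct X {};
struct Y : X { char c; };     // empty base, unrelated first member
struct W : X { X first; };    // empty base and a first member of the same type

// On typical compilers the empty base sits at the start of Y and
// shares its address with the first member.
inline bool base_shares_address_with_member() {
    Y y{};
    return static_cast<void*>(static_cast<X*>(&y)) == static_cast<void*>(&y.c);
}

// Two X subobjects of one object may not share an address, so here
// the base and the member have to be kept apart.
inline bool same_type_base_and_member_are_distinct() {
    W w{};
    return static_cast<X*>(&w) != &w.first;
}

inline void check_base_placement() {
    assert(base_shares_address_with_member());
    assert(same_type_base_and_member_are_distinct());
}
```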

So such an optimization would be more complex, more subtle, and more likely to break in unexpected ways.

I can't cite any cases off-hand where zero-size member objects would definitely break, but I'm not convinced that they don't exist either. I've already pointed out a couple of limitations that don't exist in the base-class case, and most likely, more exist. So the question is, how much complexity and uncertainty should the language allow just to make one rarely useful optimization possible?

jalf answered Oct 01 '22 22:10