I've been reading through the Clang source code and discovered something interesting about the ARM C++ ABI whose justification I can't seem to understand. From an online version of the ARM ABI documentation:
This ABI requires C1 and C2 constructors to return this (instead of being void functions) so that a C3 constructor can tail call the C1 constructor and the C1 constructor can tail call C2.
(and similarly for non-virtual destructors)
I'm not sure what C1, C2, and C3 refer to here. This section is meant to be a modification of §3.1.5 from the generic (i.e. Itanium) ABI, but that section (at least in this online version) simply states:
Constructors return void results.
Anyway, I really can't figure out what the purpose of this is: how does making a constructor return this allow tail call optimization, and in what circumstances?
As far as I can tell, the only time a constructor could tail call another with the same this return value would be the case of a derived class with a single base class, a trivial constructor body, no members with non-trivial constructors, and no virtual table pointer. In fact, it seems like a tail call would actually be easier, not harder, to apply with a void return, because then the single-base-class restriction could be dropped (in the multiple base class case, the this pointer returned from the last called constructor will not be the this pointer of the derived object).
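To illustrate the multiple-base-class point, here is a small sketch showing that the second base subobject usually lives at a different address from the derived object. The names Left, Right, Both, and second_base_offset are made up for this example, and the layout it checks (first base at offset 0, second base after it) is what the Itanium-style ABIs produce, not something the C++ standard guarantees.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classes, used only for illustration.
struct Left  { int l; };
struct Right { int r; };
struct Both : Left, Right {};

// Returns the byte offset of the Right subobject inside a Both object.
std::ptrdiff_t second_base_offset() {
    Both b{};
    // Assumption: under Itanium-style layout the first base shares the
    // address of the derived object.
    assert(static_cast<void*>(static_cast<Left*>(&b)) == static_cast<void*>(&b));
    return reinterpret_cast<char*>(static_cast<Right*>(&b)) -
           reinterpret_cast<char*>(&b);
}
```

A constructor for Right that returned its own this would therefore hand back a pointer that is not the this of the enclosing Both object.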
What am I missing here? Is there something about the ARM calling convention that makes the this return necessary?
Ok, helpful link from @Michael made this all clear...C1, C2, and C3 refer to the name-mangling of the "complete object constructor", "base object constructor", and "complete object allocating constructor", respectively, from the Itanium ABI:
  <ctor-dtor-name> ::= C1   # complete object constructor
                   ::= C2   # base object constructor
                   ::= C3   # complete object allocating constructor
                   ::= D0   # deleting destructor
                   ::= D1   # complete object destructor
                   ::= D2   # base object destructor
The C3/"complete object allocating constructor" is a version of the constructor that, rather than operating on already allocated storage passed to it via the this parameter, allocates memory internally (via operator new) and then calls the C1/"complete object constructor", which is the normal constructor used for the complete object case. Since the C3 constructor must return the this pointer to the newly allocated and constructed object, the C1 constructor must also return the this pointer in order for a tail call to be used.
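As a rough sketch of that relationship, the two constructor variants can be modeled as plain functions. Widget, Widget_C1, and Widget_C3 are names invented for this example; a real compiler emits these as mangled symbols, and this is only a model of the calling pattern, not how Clang generates them.

```cpp
#include <cassert>
#include <new>

// Hypothetical class, used only for illustration.
struct Widget {
    int value;
};

// Models a C1 "complete object constructor" under the ARM rule:
// it constructs into already allocated storage and returns this.
Widget* Widget_C1(void* storage, int v) {
    assert(storage != nullptr);
    return ::new (storage) Widget{v};
}

// Models a C3 "complete object allocating constructor": it allocates
// the storage itself, then finishes with a call to C1 in tail position.
// Because C1 already returns the pointer C3 must return, nothing has to
// be saved across the call.
Widget* Widget_C3(int v) {
    void* storage = ::operator new(sizeof(Widget));
    return Widget_C1(storage, v);
}
```

If Widget_C1 returned void instead, Widget_C3 would have to keep storage alive across the call in order to return it afterwards, so the call could no longer be the last thing it does.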
The C2/"base object constructor" is the constructor called by derived classes when constructing a base class subobject. The semantics of C1 and C2 constructors differ in the case of virtual inheritance (only C1 constructs the virtual base subobjects), and the two could be implemented differently for optimization purposes as well. In the case of virtual inheritance, a C1 constructor could be implemented as calls to the virtual base class constructors followed by a tail call to a C2 constructor, so the latter should also return this if the former does.
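The virtual inheritance case can be modeled the same way. This sketch uses a deliberately flattened, made-up layout (DerivedStorage with its virtual base stored inline) and invented function names; real objects locate virtual bases through vtable offsets, which is left out here.

```cpp
#include <cassert>

// Hypothetical flattened layout: Derived's own data followed by the
// storage for its virtual base. Real layouts are more involved.
struct VBaseStorage   { int tag; };
struct DerivedStorage { int own; VBaseStorage vbase; };

// Models the C2 constructor of the virtual base.
VBaseStorage* VBase_C2(VBaseStorage* self) {
    self->tag = 1;
    return self;
}

// Models Derived's C2 constructor: it initializes only Derived's own
// part and leaves the virtual base to whoever is most derived.
DerivedStorage* Derived_C2(DerivedStorage* self) {
    self->own = 2;
    return self;
}

// Models Derived's C1 constructor: construct the virtual base first,
// then tail call C2. The return value of C2 is exactly what C1 needs.
DerivedStorage* Derived_C1(DerivedStorage* self) {
    VBaseStorage* vb = VBase_C2(&self->vbase);
    // The base constructor's result points at the subobject, not at self.
    assert(static_cast<void*>(vb) != static_cast<void*>(self));
    return Derived_C2(self);
}
```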
The destructor case is slightly different but related. As per the ARM ABI:
Similarly, we require D2 and D1 to return this so that D0 need not save and restore this and D1 can tail call D2 (if there are no virtual bases). D0 is still a void function.
The D0/"deleting destructor" is used when deleting an object: it calls the D1/"complete object destructor" and then calls operator delete with the this pointer to free the memory. Having the D1 destructor return this allows the D0 destructor to pass that return value straight to operator delete, rather than having to save this in another register or spill it to memory. Similarly, the D2/"base object destructor" should return this as well, so that D1 can tail call it.
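Here is a small sketch of the destructor side, again modeled with plain functions. Gadget, Gadget_D1, Gadget_D0, and the gadgets_destroyed counter are all invented for illustration and only mimic the calling pattern.

```cpp
#include <cassert>
#include <new>

// Hypothetical class, used only for illustration.
struct Gadget {
    int value;
};

// Counter so the example can observe that D1 actually ran.
int gadgets_destroyed = 0;

// Models a D1 "complete object destructor" under the ARM rule:
// it destroys the object and returns this.
Gadget* Gadget_D1(Gadget* self) {
    assert(self != nullptr);
    self->~Gadget();
    ++gadgets_destroyed;
    return self;
}

// Models a D0 "deleting destructor": the pointer returned by D1 is fed
// directly to operator delete, so D0 never needs its own saved copy of
// this. D0 itself still returns void.
void Gadget_D0(Gadget* self) {
    ::operator delete(Gadget_D1(self));
}
```

On ARM the first argument and the return value share the same register (r0), so in this pattern the pointer is already in the right place for the call to operator delete when D1 returns.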
The ARM ABI also adds:
We do not require thunks to virtual destructors to return this. Such a thunk would have to adjust the destructor’s result, preventing it from tail calling the destructor, and nullifying any possible saving.
Consequently, only non-virtual calls of D1 and D2 destructors can be relied on to return this.
If I understand this correctly, it means that this save-restore-elision optimization can only be used when D0 calls D1 statically (i.e. in the case of a non-virtual destructor).