I've been reading through the Clang source code and discovered something interesting about the ARM C++ ABI whose justification I can't work out. From an online version of the ARM ABI documentation:
This ABI requires C1 and C2 constructors to return this (instead of being void functions) so that a C3 constructor can tail call the C1 constructor and the C1 constructor can tail call C2.
(and similarly for non-virtual destructors)
I'm not sure what C1, C2, and C3 refer to here. This section is meant to be a modification of §3.1.5 of the generic (i.e. Itanium) ABI, but that section (at least in this online version) simply states:
Constructors return void results.
Anyway, I really can't figure out what the purpose of this is: how does making a constructor return this allow tail call optimization, and in what circumstances?
As far as I can tell, the only time a constructor could tail call another with the same this return value would be the case of a derived class with a single base class, a trivial constructor body, no members with non-trivial constructors, and no virtual table pointer. In fact, it seems like it would actually be easier, not harder, to optimize with a tail call given a void return, because then the restriction to a single base class could be eliminated (in the multiple base class case, the this pointer returned from the last called constructor will not be the this pointer of the derived object).
What am I missing here? Is there something about the ARM calling convention that makes the this return necessary?
OK, a helpful link from @Michael made this all clear. C1, C2, and C3 refer to the name mangling of the "complete object constructor", "base object constructor", and "complete object allocating constructor", respectively, from the Itanium ABI:
<ctor-dtor-name> ::= C1   # complete object constructor
                 ::= C2   # base object constructor
                 ::= C3   # complete object allocating constructor
                 ::= D0   # deleting destructor
                 ::= D1   # complete object destructor
                 ::= D2   # base object destructor
The C3/"complete object allocating constructor" is a version of the constructor that, rather than operating on already allocated storage passed to it via the this parameter, allocates memory internally (via operator new) and then calls the C1/"complete object constructor", which is the normal constructor used for the complete object case. Since the C3 constructor must return the this pointer to the newly allocated and constructed object, the C1 constructor must also return the this pointer in order for a tail call to be used.
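To make the tail-call argument concrete, here is a rough model in plain functions. C++ source cannot spell out a constructor's return value, so point_c1, point_c3 and their _void counterparts are hypothetical stand-ins for what a compiler might emit, not real ABI entry points, and Point is a made-up type.

```cpp
#include <cassert>
#include <new>

struct Point { int x; int y; };

// Stand-in for C1 under the ARM rule: initialize the storage, return `self`.
Point* point_c1(Point* self, int x, int y) {
    self->x = x;
    self->y = y;
    return self;
}

// Stand-in for C3: allocate, then hand off to C1. The call is the last
// action and its result is returned unchanged, so it is in tail position.
Point* point_c3(int x, int y) {
    void* raw = ::operator new(sizeof(Point));
    return point_c1(static_cast<Point*>(raw), x, y);
}

// Same pair with a void C1, as in the generic ABI.
void point_c1_void(Point* self, int x, int y) {
    self->x = x;
    self->y = y;
}

Point* point_c3_void(int x, int y) {
    Point* self = static_cast<Point*>(::operator new(sizeof(Point)));
    point_c1_void(self, x, y);  // not a tail call: work remains afterwards
    return self;                // `self` has to survive the call above
}

// Small usage check (assumption: both variants build the same object).
void check_allocating_constructor() {
    Point* a = point_c3(3, 4);
    Point* b = point_c3_void(3, 4);
    assert(a->x == b->x && a->y == b->y);
    ::operator delete(a);
    ::operator delete(b);
}
```

In the void variant the caller has to keep self alive across the call (in a callee-saved register or on the stack), which is exactly the cost the ARM rule is trying to avoid.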
The C2/"base object constructor" is the constructor called by derived classes when constructing a base class subobject. The semantics of the C1 and C2 constructors differ in the case of virtual inheritance, and the two could be implemented differently for optimization purposes as well. With virtual inheritance, a C1 constructor could be implemented as calls to the virtual base class constructors followed by a tail call to a C2 constructor, so the latter should also return this if the former does.
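The semantic difference between C1 and C2 under virtual inheritance can be observed from ordinary C++, without looking at generated code. The sketch below uses made-up types V, A and B: only the constructor of the most derived class initializes the virtual base, so A's initializer for V runs when A is the complete object and is skipped when A is a base subobject of B.

```cpp
#include <cassert>

struct V {
    int tag;
    explicit V(int t) : tag(t) {}
};

struct A : virtual V {
    // Used only when A is the complete object (the C1 case).
    A() : V(1) {}
};

struct B : A {
    // B is the most derived class here, so B initializes V.
    // A is built as a base subobject (the C2 case) and its V(1) is ignored.
    B() : V(2), A() {}
};

int tag_of_complete_a() {
    A a;
    return a.tag;
}

int tag_of_complete_b() {
    B b;
    return b.tag;
}

void check_virtual_base_initialization() {
    assert(tag_of_complete_a() == 1);
    assert(tag_of_complete_b() == 2);
}
```

This only shows why two constructor variants are needed. Whether a given compiler really implements C1 as "construct virtual bases, then tail call C2" is an implementation choice that this example does not demonstrate.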
The destructor case is slightly different but related. As per the ARM ABI:
Similarly, we require D2 and D1 to return this so that D0 need not save and restore this and D1 can tail call D2 (if there are no virtual bases). D0 is still a void function.
The D0/"deleting destructor" is used when deleting an object: it calls the D1/"complete object destructor" and afterwards calls operator delete with the this pointer to free the memory. Having the D1 destructor return this allows the D0 destructor to use its return value to call operator delete, rather than having to save this in another register or spill it to memory. Similarly, the D2/"base object destructor" should return this as well, so that D1 can tail call it.
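The destructor side can be modelled the same way. Again this is only a sketch: widget_d0, widget_d1 and widget_d2 are hypothetical stand-ins for the D0, D1 and D2 variants, and release_storage with its last_released bookkeeping exists purely so the example can check which address was freed.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

struct Widget { int alive; };

// Hypothetical bookkeeping: address most recently passed to release_storage.
std::uintptr_t last_released = 0;

void release_storage(void* p) {
    last_released = reinterpret_cast<std::uintptr_t>(p);
    ::operator delete(p);
}

// Stand-in for D2: tear down the object, return `self`.
Widget* widget_d2(Widget* self) {
    self->alive = 0;
    return self;
}

// Stand-in for D1 with no virtual bases: a plain tail call to D2.
Widget* widget_d1(Widget* self) {
    return widget_d2(self);
}

// Stand-in for D0: still void. D1's result feeds the deallocation
// directly, so `self` need not be saved across the call.
void widget_d0(Widget* self) {
    release_storage(widget_d1(self));
}

void check_deleting_destructor() {
    Widget* w = static_cast<Widget*>(::operator new(sizeof(Widget)));
    w->alive = 1;
    const std::uintptr_t address = reinterpret_cast<std::uintptr_t>(w);
    widget_d0(w);
    assert(last_released == address);
}
```

With a void widget_d1, widget_d0 would have to keep its own copy of self live across the call before passing it to release_storage.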
The ARM ABI also adds:
We do not require thunks to virtual destructors to return this. Such a thunk would have to adjust the destructor’s result, preventing it from tail calling the destructor, and nullifying any possible saving.
Consequently, only non-virtual calls of D1 and D2 destructors can be relied on to return this.
If I understand this correctly, it means that this save-restore-elision optimization can only be used when D0 calls D1 statically (i.e. in the case of a non-virtual destructor).