The following code seems to run fine when compiled with clang or gcc (on macOS), but crashes when compiled with MS Visual C++ 2017. On the latter, the foo_clone object appears to be corrupted and the program crashes with an access violation on the line foo_clone->get_identifier()
.
It does work on VC++ if I remove the covariant return types (all clone-methods return IDO*
), or when std::enable_shared_from_this
is removed, or when all inheritance is made virtual.
Why does it work with clang/gcc but not with VC++?
#include <memory>
#include <iostream>
class IDO {
public:
virtual ~IDO() = default;
virtual const char* get_identifier() const = 0;
virtual IDO* clone() const = 0;
};
class DO
: public virtual IDO
, public std::enable_shared_from_this<DO>
{
public:
const char* get_identifier() const override { return "ok"; }
};
class D : public virtual IDO, public DO {
D* clone() const override {
return nullptr;
}
};
class IA : public virtual IDO {};
class Foo : public IA, public D {
public:
Foo* clone() const override {
return new Foo();
}
};
int main(int argc, char* argv[]) {
Foo* foo = new Foo();
Foo* foo_clone = foo->clone();
foo_clone->get_identifier();
}
Message:
Exception thrown at 0x00007FF60940180B in foo.exe: 0xC0000005: Access violation reading location 0x0000000000000004.
This appears to be a miscompilation by VC++. It going away when the enable_shared_from_this
isn't there is a red herring; the problem is just masked.
Some background: resolving overridden functions in C++ generally happens via vtables. However, in the presence of multiple and virtual inheritance and co-variant return types, there are some challenges that must be met, and different ways of meeting them.
Consider:
Foo* foo = new Foo();
IDO* ido = foo;
D* d = foo;
foo->clone(); // must call Foo::clone() and return a Foo*
ido->clone(); // must call Foo::clone() and return an IDO*
d->clone(); // must call Foo::clone() and return a D*
Keep in mind that Foo::clone()
returns a Foo*
no matter what, and conversion from Foo*
to IDO*
or D*
is not a simple no-op. Within the complete Foo
object, the IDO
subobject lives at offset 32 (assuming MSVC++ and 64-bit compilation), and the D
subobject lives at offset 8. To get from a Foo*
to a D*
means adding 8 to the pointer, and getting an IDO*
actually means loading information from the Foo*
where exactly the IDO
subobject is located.
However, let's look at the vtable generated for all these classes. A vtable for IDO
has this layout:
0: destructor
1: const char* get_identifier() const
2: IDO* clone() const
A vtable for D
has this layout:
0: destructor
1: const char* get_identifier() const
2: IDO* clone() const
3: D* clone() const
Slot 2 is there because the base class IDO
has this function. Slot 3 is there because this function exists too. Could we omit this slot, and instead generate additional code at callsites to convert from the IDO*
to the D*
? Perhaps, but that would be less efficient.
A vtable for Foo
looks like this:
0: destructor
1: const char* get_identifier() const
2: IDO* clone() const
3: D* clone() const
4: Foo* clone() const
5: Foo* clone() const
Again, it inherits the layout of D
and appends its own slots. I actually have no idea why there are two new slots - it's probably just a suboptimal algorithm that sticks around for compatibility reasons.
Now, what do we put into these slots for a concrete object of type Foo
? Slots 4 and 5 simple get Foo::clone()
. But that function returns a Foo*
, so it's not suitable for slots 2 and 3. For these, the compiler creates stubs (called thunks) that call the main version and convert the result, i.e. the compiler creates something like this for slot 3:
D* Foo::clone$D() const {
Foo* real = clone();
return static_cast<D*>(real);
}
Now we get to the miscompilation: for some reason, the compiler, upon seeing this call:
foo->clone();
calls not slot 4 or 5, but slot 3. But slot 3 returns a D*
! The code then just continues to use that D*
as a Foo*
, or in other words, you get the same kind of behavior as if you had done:
Foo* wtf = reinterpret_cast<Foo*>(
reinterpret_cast<char*>(foo_clone) + 8);
This obviously isn't going to end well.
Specifically, what happens is that in the call foo_clone->get_identifier();
the compiler wants to cast the Foo* foo_clone
to an IDO*
(get_identifier
requires its this
pointer to be an IDO*
because it was originally declared in IDO
). As I mentioned before, the exact position of the IDO
object within any Foo
object is not fixed; it depends on the full type of the object (it is 32 if the full object is a Foo
, but it might be something else if it was a class derived from Foo
). To do the conversion, the compiler must therefore load the offset from within the object. Specifically, it can load the "virtual base pointer" (vbptr) located at offset 0 of any Foo
object, which points to a "virtual base table" (vbtable), which contains the offset.
But remember, we have a corrupted Foo*
that already points to offset 8 of the real object. So we access offset 0 of offset 8, and what's there? Well, as it happens, what's there is the weak_ptr
from the enable_shared_from_this
object, and it is null. So we get null for the vbptr, and the attempt to dereference it to get the object crashes. (The offset to the virtual base is stored at offset 4 in the vbtable, which is why the crash address you get is 0x000...004.)
If you remove all the covariant shenanigans, the vtable shrinks to a nice single entry for clone()
and the miscompilation doesn't appear.
But why does the problem go away if you remove the enable_shared_from_this
? Well, because then the thing at offset 8 is not some null pointer inside a weak_ptr
, but the vbptr of the DO
subobject. (Generally speaking, every branch of a inheritance graph gets its own vbptr. IA
has one that Foo
shares, and DO
has one that D
shares.) And that vbptr contains the information needed to convert a D*
to an IDO*
. Our Foo*
is really a D*
in disguise, so everything just happens to work out correctly.
Appendix
The MSVC++ compiler has an undocumented option to dump object layouts. Here is its output for Foo
with the enable_shared_from_this
:
class Foo size(40):
+---
0 | +--- (base class IA)
0 | | {vbptr}
| +---
8 | +--- (base class D)
8 | | +--- (base class DO)
8 | | | +--- (base class std::enable_shared_from_this<class DO>)
8 | | | | ?$weak_ptr@VDO@@ _Wptr
| | | +---
24 | | | {vbptr}
| | +---
| +---
+---
+--- (virtual base IDO)
32 | {vfptr}
+---
Foo::$vbtable@IA@:
0 | 0
1 | 32 (Food(IA+0)IDO)
Foo::$vbtable@D@:
0 | -16
1 | 8 (Food(DO+16)IDO)
Foo::$vftable@:
| -32
0 | &Foo::{dtor}
1 | &DO::get_identifier
2 | &IDO* Foo::clone
3 | &D* Foo::clone
4 | &Foo* Foo::clone
5 | &Foo* Foo::clone
Foo::clone this adjustor: 32
Foo::{dtor} this adjustor: 32
Foo::__delDtor this adjustor: 32
Foo::__vecDelDtor this adjustor: 32
vbi: class offset o.vbptr o.vbte fVtorDisp
IDO 32 0 4 0
Here it is without:
class Foo size(24):
+---
0 | +--- (base class IA)
0 | | {vbptr}
| +---
8 | +--- (base class D)
8 | | +--- (base class DO)
8 | | | {vbptr}
| | +---
| +---
+---
+--- (virtual base IDO)
16 | {vfptr}
+---
Foo::$vbtable@IA@:
0 | 0
1 | 16 (Food(IA+0)IDO)
Foo::$vbtable@D@:
0 | 0
1 | 8 (Food(DO+0)IDO)
Foo::$vftable@:
| -16
0 | &Foo::{dtor}
1 | &DO::get_identifier
2 | &IDO* Foo::clone
3 | &D* Foo::clone
4 | &Foo* Foo::clone
5 | &Foo* Foo::clone
Foo::clone this adjustor: 16
Foo::{dtor} this adjustor: 16
Foo::__delDtor this adjustor: 16
Foo::__vecDelDtor this adjustor: 16
vbi: class offset o.vbptr o.vbte fVtorDisp
IDO 16 0 4 0
Here's some cleaned-up disassembly of the return-adjusting clone
shim:
mov rcx,qword ptr [this]
call Foo::clone ; the real clone
cmp rax,0 ; null pointer remains null pointer
je fin
add rax,8 ; otherwise, add the offset to the D*
jmp fin
fin: ret
Here's some cleanup-up disassembly of the faulting call:
mov rax,qword ptr [foo]
mov rcx,rax
mov rax,qword ptr [rax] ; load vbptr
movsxd rax,dword ptr [rax+4] ; load offset to IDO subobject
add rcx,rax ; add offset to Foo* to get IDO*
mov rax,qword ptr [rcx] ; load vtbl
call qword ptr [rax+24] ; call function at position 3 (D* clone)
And here's some cleaned-up disassembly of the crashing call:
mov rax,qword ptr [foo_clone]
mov rcx,rax
mov rax,qword ptr [rax] ; load vbptr, loads null in the crashing case
movsxd rax,dword ptr [rax+4] ; load offset to IDO subobject, crashes
add rcx,rax ; add offset to Foo* to get IDO*
mov rax,qword ptr [rcx] ; load vtbl
call qword ptr [rax+8] ; call function at position 1 (get_identifier)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With