Disassembling virtual methods in multiple inheritance. How does the vtable work?

Assuming the following C++ source file:

#include <stdio.h>

class BaseTest {
  public:
  int a;

  BaseTest(): a(2){}

  virtual int gB() {
    return a;
  };
};

class SubTest: public BaseTest {
  public:
  int b;

  SubTest(): b(4){}
};

class TriTest: public BaseTest {
  public:
  int c;
  TriTest(): c(42){}
};

class EvilTest: public SubTest, public TriTest {
  public:
  virtual int gB(){
    return b;
  }
};

int main(){
  EvilTest * t2 = new EvilTest;

  TriTest * t3 = t2;

  printf("%d\n",t3->gB());
  printf("%d\n",t2->gB());
  return 0;
}

-fdump-class-hierarchy gives me:

[...]
Vtable for EvilTest
EvilTest::_ZTV8EvilTest: 6u entries
0     (int (*)(...))0
8     (int (*)(...))(& _ZTI8EvilTest)
16    (int (*)(...))EvilTest::gB
24    (int (*)(...))-16
32    (int (*)(...))(& _ZTI8EvilTest)
40    (int (*)(...))EvilTest::_ZThn16_N8EvilTest2gBEv

Class EvilTest
   size=32 align=8
   base size=32 base align=8
EvilTest (0x0x7f1ba98a8150) 0
    vptr=((& EvilTest::_ZTV8EvilTest) + 16u)
  SubTest (0x0x7f1ba96df478) 0
      primary-for EvilTest (0x0x7f1ba98a8150)
    BaseTest (0x0x7f1ba982ba80) 0
        primary-for SubTest (0x0x7f1ba96df478)
  TriTest (0x0x7f1ba96df4e0) 16
      vptr=((& EvilTest::_ZTV8EvilTest) + 40u)
    BaseTest (0x0x7f1ba982bae0) 16
        primary-for TriTest (0x0x7f1ba96df4e0)

Disassembly shows:

34  int main(){
   0x000000000040076d <+0>: push   rbp
   0x000000000040076e <+1>: mov    rbp,rsp
   0x0000000000400771 <+4>: push   rbx
   0x0000000000400772 <+5>: sub    rsp,0x18

35    EvilTest * t2 = new EvilTest;
   0x0000000000400776 <+9>: mov    edi,0x20
   0x000000000040077b <+14>:    call   0x400670 <_Znwm@plt>
   0x0000000000400780 <+19>:    mov    rbx,rax
   0x0000000000400783 <+22>:    mov    rdi,rbx
   0x0000000000400786 <+25>:    call   0x4008a8 <EvilTest::EvilTest()>
   0x000000000040078b <+30>:    mov    QWORD PTR [rbp-0x18],rbx

36    
37    TriTest * t3 = t2;
   0x000000000040078f <+34>:    cmp    QWORD PTR [rbp-0x18],0x0
   0x0000000000400794 <+39>:    je     0x4007a0 <main()+51>
   0x0000000000400796 <+41>:    mov    rax,QWORD PTR [rbp-0x18]
   0x000000000040079a <+45>:    add    rax,0x10
   0x000000000040079e <+49>:    jmp    0x4007a5 <main()+56>
   0x00000000004007a0 <+51>:    mov    eax,0x0
   0x00000000004007a5 <+56>:    mov    QWORD PTR [rbp-0x20],rax

38    
39    printf("%d\n",t3->gB());
   0x00000000004007a9 <+60>:    mov    rax,QWORD PTR [rbp-0x20]
   0x00000000004007ad <+64>:    mov    rax,QWORD PTR [rax]
   0x00000000004007b0 <+67>:    mov    rax,QWORD PTR [rax]
   0x00000000004007b3 <+70>:    mov    rdx,QWORD PTR [rbp-0x20]
   0x00000000004007b7 <+74>:    mov    rdi,rdx
   0x00000000004007ba <+77>:    call   rax
   0x00000000004007bc <+79>:    mov    esi,eax
   0x00000000004007be <+81>:    mov    edi,0x400984
   0x00000000004007c3 <+86>:    mov    eax,0x0
   0x00000000004007c8 <+91>:    call   0x400640 <printf@plt>

40    printf("%d\n",t2->gB());
   0x00000000004007cd <+96>:    mov    rax,QWORD PTR [rbp-0x18]
   0x00000000004007d1 <+100>:   mov    rax,QWORD PTR [rax]
   0x00000000004007d4 <+103>:   mov    rax,QWORD PTR [rax]
   0x00000000004007d7 <+106>:   mov    rdx,QWORD PTR [rbp-0x18]
   0x00000000004007db <+110>:   mov    rdi,rdx
   0x00000000004007de <+113>:   call   rax
   0x00000000004007e0 <+115>:   mov    esi,eax
   0x00000000004007e2 <+117>:   mov    edi,0x400984
   0x00000000004007e7 <+122>:   mov    eax,0x0
   0x00000000004007ec <+127>:   call   0x400640 <printf@plt>

41    return 0;
   0x00000000004007f1 <+132>:   mov    eax,0x0

42  }
   0x00000000004007f6 <+137>:   add    rsp,0x18
   0x00000000004007fa <+141>:   pop    rbx
   0x00000000004007fb <+142>:   pop    rbp
   0x00000000004007fc <+143>:   ret

Now that you've had suitable time to recover from the deadly diamond in the first code block, the actual question.

When t3->gB() is called I see the following disas (t3 is type TriTest, gB() is virtual method EvilTest::gB() ):

   0x00000000004007a9 <+60>:    mov    rax,QWORD PTR [rbp-0x20]
   0x00000000004007ad <+64>:    mov    rax,QWORD PTR [rax]
   0x00000000004007b0 <+67>:    mov    rax,QWORD PTR [rax]
   0x00000000004007b3 <+70>:    mov    rdx,QWORD PTR [rbp-0x20]
   0x00000000004007b7 <+74>:    mov    rdi,rdx
   0x00000000004007ba <+77>:    call   rax

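The first mov loads t3 (the object pointer stored at [rbp-0x20]) into rax, and the next dereferences it to fetch the object's vtable pointer (now we're in the vtable).

The one after that dereferences the vtable pointer to get the function pointer in slot 0, and the call at the bottom of the paste invokes it.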

So far so good, but this brings a few questions.

Where's this?
I presume this is loaded into rdi via the movs at +70 and +74, but that's the same pointer that was used to find the vtable, which means it points to a TriTest, and a TriTest shouldn't have SubTest's b member at all. Does the Linux thiscall convention handle the pointer cast inside the called method as opposed to outside?

This was answered by rodrigo here

How do I disassemble the virtual method?
If I knew this I could answer the previous question myself. disas EvilTest::gB gives me:

Cannot reference virtual member function "gB"

Setting a breakpoint before the call, running info reg rax and disassembling that address gives me:

(gdb) info reg rax
rax            0x4008a1 4196513
(gdb) disas 0x4008a14196513
No function contains specified address.
(gdb) disas *0x4008a14196513
Cannot access memory at address 0x4008a14196513

Why are the vtables (apparently) only 8 bytes away from each other?
The -fdump-class-hierarchy output says there are 16 bytes between the first and second &vtable (which fits a 64-bit pointer and 2 ints), but the disassembly of the second gB() call is:

   0x00000000004007cd <+96>:    mov    rax,QWORD PTR [rbp-0x18]
   0x00000000004007d1 <+100>:   mov    rax,QWORD PTR [rax]
   0x00000000004007d4 <+103>:   mov    rax,QWORD PTR [rax]
   0x00000000004007d7 <+106>:   mov    rdx,QWORD PTR [rbp-0x18]
   0x00000000004007db <+110>:   mov    rdi,rdx
   0x00000000004007de <+113>:   call   rax

[rbp-0x18] is only 8 bytes away from the previous call ([rbp-0x20]). What's going on?

Answered by 500 in the comments

I forgot the objects were heap allocated, only their pointers are on the stack

J V asked May 07 '14 22:05



1 Answer

Disclaimer: I'm no expert in GCC internals, but I'll try to explain what I think is going on. Also note that you are not using virtual inheritance, but plain multiple inheritance, so your EvilTest object actually contains two BaseTest subobjects. You can see that this is the case by trying to use this->a in EvilTest: you'll get an ambiguous reference error.
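The two BaseTest subobjects can be made visible with a small sketch. The classes are the ones from the question; the helper names hasTwoBaseSubobjects and writeSubReadTri are made up for this illustration. Qualifying the member as SubTest::a or TriTest::a is what resolves the ambiguity mentioned above.

```cpp
#include <cassert>

class BaseTest {
public:
    int a;
    BaseTest() : a(2) {}
    virtual int gB() { return a; }
};

class SubTest : public BaseTest {
public:
    int b;
    SubTest() : b(4) {}
};

class TriTest : public BaseTest {
public:
    int c;
    TriTest() : c(42) {}
};

class EvilTest : public SubTest, public TriTest {
public:
    int gB() override { return b; }
};

// Hypothetical helper: true when the BaseTest reached through SubTest and
// the BaseTest reached through TriTest are different subobjects.
bool hasTwoBaseSubobjects(EvilTest* e) {
    assert(e != nullptr);
    BaseTest* viaSub = static_cast<SubTest*>(e);
    BaseTest* viaTri = static_cast<TriTest*>(e);
    return viaSub != viaTri;
}

// Hypothetical helper: writes the SubTest-side copy of `a`, then returns
// the TriTest-side copy, which is a separate int and keeps its old value.
int writeSubReadTri(EvilTest* e, int value) {
    assert(e != nullptr);
    e->SubTest::a = value;   // plain `e->a` would be ambiguous
    return e->TriTest::a;
}
```

Calling writeSubReadTri(&e, 7) on a fresh EvilTest e leaves e.TriTest::a at its constructor value of 2, because only the SubTest-side copy was written.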

First of all, be aware that every VTable has two values at negative offsets:

  • -2: the this offset (more on this later).
  • -1: pointer to run-time type information for this class.

Then, from 0 on, come the pointers to the virtual functions.

With that in mind, I'll write the VTable of the classes, with easy to read names:

VTable for BaseTest:

[-2]: 0
[-1]: typeof(BaseTest)
[ 0]: BaseTest::gB

VTable for SubTest:

[-2]: 0
[-1]: typeof(SubTest)
[ 0]: BaseTest::gB

VTable for TriTest

[-2]: 0
[-1]: typeof(TriTest)
[ 0]: BaseTest::gB

Up until this point nothing too interesting.

VTable for EvilTest

[-2]: 0
[-1]: typeof(EvilTest)
[ 0]: EvilTest::gB
[ 1]: -16
[ 2]: typeof(EvilTest)
[ 3]: EvilTest::thunk_gB

Now that is interesting! It is easier to see it working:

EvilTest * t2 = new EvilTest;
t2->gB();

This code calls the function at VTable[0], that is simply EvilTest::gB and all goes fine.

But then you do:

TriTest * t3 = t2;

Since TriTest is not the first base class of EvilTest, the actual binary value of t3 is different from that of t2. That is, the cast advances the pointer N bytes. The exact amount is known by the compiler at compile time, because it depends only on the static types of the expressions. In your code it is 16 bytes. Note that if the pointer is NULL, then it must not be advanced, hence the branch in the disassembly.
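A minimal sketch of that pointer adjustment, using the classes from the question. The helper names triTestOffset and nullCastStaysNull are invented for this example, and the actual byte count depends on the compiler and platform.

```cpp
#include <cassert>
#include <cstddef>

class BaseTest {
public:
    int a;
    BaseTest() : a(2) {}
    virtual int gB() { return a; }
};

class SubTest : public BaseTest {
public:
    int b;
    SubTest() : b(4) {}
};

class TriTest : public BaseTest {
public:
    int c;
    TriTest() : c(42) {}
};

class EvilTest : public SubTest, public TriTest {
public:
    int gB() override { return b; }
};

// Hypothetical helper: byte distance between the TriTest* and the EvilTest*
// that refer to the same object.
std::ptrdiff_t triTestOffset(EvilTest* e) {
    assert(e != nullptr);
    TriTest* asTri = e;  // implicit upcast: the compiler adds the offset here
    return reinterpret_cast<char*>(asTri) - reinterpret_cast<char*>(e);
}

// Hypothetical helper: a null EvilTest* must convert to a null TriTest*,
// not to a pointer holding the bare offset.
bool nullCastStaysNull() {
    EvilTest* none = nullptr;
    TriTest* asTri = none;  // the compiler emits the null check for us
    return asTri == nullptr;
}
```

With the GCC x86-64 build shown in the question, triTestOffset should return 16, matching the add rax,0x10 in the disassembly.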

At this point it is interesting to see the memory layout of the EvilTest object:

[ 0]: pointer to VTable of EvilTest-as-BaseTest
[ 1]: BaseTest::a
[ 2]: SubTest::b
[ 3]: pointer to VTable of EvilTest-as-TriTest
[ 4]: BaseTest::a
[ 5]: TriTest::c

As you can see, when you cast an EvilTest* to a TriTest* you have to advance the pointer to element [3], that is 8+4+4 = 16 bytes on a 64-bit system.
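The layout can be checked at run time by measuring where each member sits relative to the start of the object. This is only a sketch: the Layout struct and the describeLayout and offsetIn helpers are invented here, and the numbers are implementation-specific.

```cpp
#include <cassert>
#include <cstddef>

class BaseTest {
public:
    int a;
    BaseTest() : a(2) {}
    virtual int gB() { return a; }
};

class SubTest : public BaseTest {
public:
    int b;
    SubTest() : b(4) {}
};

class TriTest : public BaseTest {
public:
    int c;
    TriTest() : c(42) {}
};

class EvilTest : public SubTest, public TriTest {
public:
    int gB() override { return b; }
};

// Hypothetical record of byte offsets from the start of an EvilTest.
struct Layout {
    std::ptrdiff_t subA;     // BaseTest::a reached through SubTest
    std::ptrdiff_t b;        // SubTest::b
    std::ptrdiff_t triTest;  // start of the TriTest subobject (its vptr)
    std::ptrdiff_t triA;     // BaseTest::a reached through TriTest
    std::ptrdiff_t c;        // TriTest::c
    std::size_t size;        // sizeof(EvilTest)
};

// Distance in bytes from the start of *e to the address p.
static std::ptrdiff_t offsetIn(const EvilTest* e, const void* p) {
    return static_cast<const char*>(p) - reinterpret_cast<const char*>(e);
}

Layout describeLayout(const EvilTest* e) {
    assert(e != nullptr);
    const TriTest* tri = e;  // upcast yields the TriTest subobject address
    Layout l;
    l.subA = offsetIn(e, &e->SubTest::a);
    l.b = offsetIn(e, &e->b);
    l.triTest = offsetIn(e, tri);
    l.triA = offsetIn(e, &e->TriTest::a);
    l.c = offsetIn(e, &e->c);
    l.size = sizeof(EvilTest);
    return l;
}
```

On the GCC x86-64 build from the question, the offsets should come out as 8, 12, 16, 24 and 28 with a size of 32, which agrees with the size=32 and the TriTest offset of 16 in the -fdump-class-hierarchy output.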

t3->gB();

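Now you use that pointer to call gB(). That is done using element [0] of the VTable, as before, except that t3 points at the TriTest part of the object, so its element [0] is the entry listed above as [3]: EvilTest::thunk_gB. Since the real function is actually from EvilTest, the this pointer must be moved back 16 bytes before EvilTest::gB() can be called. That is the work of EvilTest::thunk_gB(): a little function that subtracts 16 from this, matching the -16 stored at VTable[-2] of the TriTest part, and then jumps to EvilTest::gB(). The 16 in the mangled name _ZThn16_N8EvilTest2gBEv is that same adjustment. Now everything matches!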
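What the thunk does can be imitated in plain C++. The function thunk_gB below is a hand-written stand-in, not the compiler-generated code, and callThroughTriTest is a helper invented for the comparison. The static_cast from TriTest* down to EvilTest* performs the same backwards pointer adjustment.

```cpp
#include <cassert>

class BaseTest {
public:
    int a;
    BaseTest() : a(2) {}
    virtual int gB() { return a; }
};

class SubTest : public BaseTest {
public:
    int b;
    SubTest() : b(4) {}
};

class TriTest : public BaseTest {
public:
    int c;
    TriTest() : c(42) {}
};

class EvilTest : public SubTest, public TriTest {
public:
    int gB() override { return b; }
};

// Hand-written imitation of the thunk: receives the TriTest-flavoured this,
// moves it back to the start of the EvilTest, then calls the real function.
// ASSUMPTION: self really points into an EvilTest object.
int thunk_gB(TriTest* self) {
    assert(self != nullptr);
    EvilTest* adjusted = static_cast<EvilTest*>(self);  // undoes the upcast offset
    return adjusted->EvilTest::gB();                    // direct, non-virtual call
}

// The ordinary virtual call from the question: t3->gB().
int callThroughTriTest(EvilTest* e) {
    TriTest* t3 = e;
    return t3->gB();
}
```

Both callThroughTriTest(&e) and thunk_gB(static_cast<TriTest*>(&e)) return 4 for a fresh EvilTest e, since both end up in EvilTest::gB() with this pointing at the start of the object, where b is reachable.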

It is worth noting that the full VTable of EvilTest is the concatenation of the VTable of EvilTest-as-BaseTest plus the VTable of EvilTest-as-TriTest.
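The 0 and -16 entries in the EvilTest VTable listed earlier can be peeked at from a running program. The sketch below is not portable: it assumes the layout GCC uses on Linux x86-64, where the first word of each polymorphic subobject is its VTable pointer and the this offset sits two pointer-sized slots before the address that pointer holds. The helper name offsetToTop is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class BaseTest {
public:
    int a;
    BaseTest() : a(2) {}
    virtual int gB() { return a; }
};

class SubTest : public BaseTest {
public:
    int b;
    SubTest() : b(4) {}
};

class TriTest : public BaseTest {
public:
    int c;
    TriTest() : c(42) {}
};

class EvilTest : public SubTest, public TriTest {
public:
    int gB() override { return b; }
};

// Hypothetical helper: returns the value stored at VTable[-2] for the
// subobject at the given address.
// ASSUMPTION: GCC/Clang layout on Linux x86-64; other ABIs differ.
std::ptrdiff_t offsetToTop(const void* subobject) {
    assert(subobject != nullptr);
    const char* vptr;
    // The first word of the subobject is its VTable pointer.
    std::memcpy(&vptr, subobject, sizeof vptr);
    std::ptrdiff_t offset;
    // Step back two slots from where the VTable pointer points.
    std::memcpy(&offset, vptr - 2 * sizeof(std::ptrdiff_t), sizeof offset);
    return offset;
}
```

Under that assumption, offsetToTop(static_cast<SubTest*>(&e)) should give 0 and offsetToTop(static_cast<TriTest*>(&e)) should give -16 for an EvilTest e, the same two values that appear at offsets 0 and 24 of the -fdump-class-hierarchy output.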

rodrigo answered Nov 12 '22 00:11