Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why do we need "this pointer adjustor thunk"?

Tags:

c++

com

I read about adjustor thunk from here. Here's some quotation:

Now, there is only one QueryInterface method, but there are two entries, one for each vtable. Remember that each function in a vtable receives the corresponding interface pointer as its "this" parameter. That's just fine for QueryInterface (1); its interface pointer is the same as the object's interface pointer. But that's bad news for QueryInterface (2), since its interface pointer is q, not p.

This is where the adjustor thunks come in.

I am wondering why "each function in a vtable receives the corresponding interface pointer as its "this" parameter"? Is it the only clue(base address) used by the interface method to locate data members within the object instance?

Update

Here is my latest understanding:

In fact, my question is not about the purpose of this parameter, but about why we have to use the corresponding interface pointer as the this parameter. Sorry for my vagueness.

Besides using the interface pointer as a locator/foothold within an object's layout. There're of course other means to do that, as long as you are the implementer of the component.

But this is not the case for the clients of our component.

When the component is built in COM way, clients of our component know nothing about the internals of our component. Clients can only take hold of the interface pointer, and this is the very pointer that will be passed into the interface method as the this parameter. Under this expectation, the compiler has no choice but to generate the interface method's code based on this specific this pointer.

So the above reasoning leads to the result that:

it must be assured that each function in a vtable must recieve the corresponding interface pointer as its "this" parameter.

In the case of "this pointer adjustor thunk", 2 different entries exist for a single QueryInterface() method, in other words, 2 different interface pointers could be used to invoke the QueryInterface() method, but the compiler only generate 1 copy of QueryInterface() method. So if one of the interfaces is chosen by the compiler as the this pointer, we need to adjust the other to the chosen one. This is what the this adjustor thunk is born for.

BTW-1, what if the compiler can generate 2 different instances of QueryInterface() method? Each one based on the corresponding interface pointer. This won't need the adjustor thunk, but it would take more space to store the extra but similar code.

BTW-2: it seems that sometimes a question lacks a reasonable explanation from the implementer's point of view, but could be better understood from the user's pointer of view.

like image 671
smwikipedia Avatar asked Aug 14 '10 00:08

smwikipedia


1 Answers

Taking away the COM part from the question, the this pointer adjustor thunk is a piece of code that makes sure that each function gets a this pointer pointing to the subobject of the concrete type. The issue comes up with multiple inheritance, where the base and derived objects are not aligned.

Consider the following code:

struct base {
   int value;
   virtual void foo() { std::cout << value << std::endl; }
   virtual void bar() { std::cout << value << std::endl; }
};
struct offset {
   char space[10];
};
struct derived : offset, base {
   int dvalue;
   virtual void foo() { std::cout << value << "," << dvalue << std::endl; }
};

(And disregard the lack of initialization). The base sub object in derived is not aligned with the start of the object, as there is a offset in between[1]. When a pointer to derived is casted to a pointer to base (including implicit casts, but not reinterpret casts that would cause UB and potential death) the value of the pointer is offsetted so that (void*)d != (void*)((base*)d) for an assumed object d of type derived.

Now condider the usage:

derived d;
base * b = &d; // This generates an offset
b->bar();
b->foo();

The issue comes when a function is called from a base pointer or reference. If the virtual dispatch mechanism finds that the final overrider is in base, then the pointer this must refer to the base object, as in b->bar, where the implicit this pointer is the same address stored in b. Now if the final overrider is in a derived class, as with b->foo() the this pointer has to be aligned with the beginning of the sub object of the type where the final overrider is found (in this case derived).

What the compiler does is creating an intermediate piece of code. When the virtual dispatch mechanism is called, and before dispatching to derived::foo the intermediate call takes the this pointer and substracts the offset to the beginning of the derived object. This operation is the same as a downcast static_cast<derived*>(this). Remember that at this point, the this pointer is of type base, so it was initially offsetted, and this effectively returns the original value &d.

[1]There is an offset even in the case of interfaces --in the Java/C# sense: classes defining only virtual methods-- as they need to store a table to that interface's vtable.

like image 192
David Rodríguez - dribeas Avatar answered Oct 20 '22 16:10

David Rodríguez - dribeas