C++ supports dynamic binding through virtual mechanism. But as I understand the virtual mechanism is an implementation detail of the compiler and the standard just specifies the behaviors of what should happen under specific scenarios. Most compilers implement the virtual mechanism through the virtual table and virtual pointer. This is not about implementation detail of virtual pointers and table. My questions are:
sizeof
of any class with just a virtual function will be size of an pointer (vptr inside this
) on that compiler. So given that virtual pointer and TBL mechanism itself is compiler implementation, will this statement I made above be always true?Yes, that's fine ... you only need to implement any pure virtual functions in order to instantiate a class derived from an abstract base class. Show activity on this post. Yes, Its correct that a Derived class has to OVERRIDE the function which is Pure Virtual in the Parent Class.
A virtual function allows derived classes to replace the implementation provided by the base class. The compiler makes sure the replacement is always called whenever the object in question is actually of the derived class, even if the object is accessed by a base pointer rather than a derived pointer.
We use virtual functions to ensure that the correct function is called for an object, regardless of the reference type used to call the function. They are basically used to achieve the runtime polymorphism and are declared in the base class by using the virtual keyword before the function.
The virtual keyword can be used when declaring overriding functions in a derived class, but it is unnecessary; overrides of virtual functions are always virtual. Virtual functions in a base class must be defined unless they are declared using the pure-specifier.
It is not true that vtable pointers in objects are always the most efficient. My compiler for another language used to use in-object pointers for similar reasons but no longer does: instead it uses a separate data structure which maps the object address to the required meta-data: in my system this happens to be shape information for use by the garbage collector.
This implementation costs a bit more storage for a single simple object, is more efficient for complex objects with many bases, and it is vastly more efficient for arrays, since only a single entry is required in the mapping table for all objects in the array. My particular implementation can also find the meta-data given a pointer to any point interior to the object.
The actual lookup is extremely fast, and the storage requirements very modest, because I am using the best data structure on the planet: Judy arrays.
I also know of no C++ compiler using anything other than vtable pointers, but it is not the only way. In fact, the initialisation semantics for classes with bases make any implementation messy. This is because the complete type has to see-saw around as the object is constructed. As a consequence of these semantics, complex mixin objects lead to massive sets of vtables being generated, large objects, and slow object initialisation. This probably isn't a consequence of the vtable technique as much as needing to slavishly follow the requirement that the run-time type of a subobject be correct at all times. Actually there's no good reason for this during construction, since constructors are not methods and can't sensibly use virtual dispatch: this isn't so clear to me for destruction since destructors are real methods.
To my knowledge, all C++ implementations use a vtable pointer, although it would be quite easy (and perhaps not so bad perf wise as you might think given caches) to keep a small type-index in the object (1-2 B) and subsequently obtain the vtable and type information with a small table lookup.
Another interesting approach might be BIBOP (http://foldoc.org/BIBOP) -- big bag of pages -- although it would have issues for C++. Idea: put objects of the same type on a page. Get a pointer to the type descriptor / vtable at the top of the page by simply and'ing off the less signficant bits of the object pointer. (Doesn't work well for objects on the stack, of course!)
Another other approach is to encode certain type tags/indices in the object pointers themselves. For example, if by construction all objects are 16-byte aligned, you can use the 4 LSBs to put a 4-bit type tag in there. (Not really enough.) Or (particularly for embedded systems) if you have guaranteed unused more-significant-bits in addresses, you can put more tag bits up there, and recover them with a shift and mask.
While both these schemes are interesting (and sometimes used) for other language implementations, they are problematic for C++. Certain C++ semantics, such as which base class virtual function overrides are called during (base class) object construction and destruction, drive you to a model where there is some state in the object that you modify as you enter base class ctors/dtors.
You may find my old tutorial on the Microsoft C++ object model implementation interesting. http://www.openrce.org/articles/files/jangrayhood.pdf
Happy hacking!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With