This question is a follow-up of sorts to "eliminate unused virtual functions", which does not go deep enough for my interest.
The problem: When defining classes that have virtual functions, the compiler allocates storage for the virtual function table, and stores pointers to the functions in the table. This causes the linker to keep the code of those functions, regardless of whether they are ever called. This could potentially cause a lot of dead code to be retained in the executable, even when the compiler optimization settings demand elimination of dead code.
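To make the problem concrete, here is a minimal sketch. The classes `Shape` and `Square` and the function `total_area` are hypothetical names invented for illustration. `describe()` is never called, yet its address sits in the vtable of `Square`, so a typical toolchain would be expected to keep its code.

```cpp
#include <string>

// Hypothetical classes for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
    // Never called anywhere in this program, but every concrete class
    // still stores a pointer to its override in the vtable.
    virtual std::string describe() const = 0;
};

struct Square : Shape {
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
    // Referenced only from Square's vtable; this is the "dead" code in question.
    std::string describe() const override {
        return "square with side " + std::to_string(side);
    }
    double side;
};

// The only virtual calls in the program go through the area() slot.
double total_area(const Shape& a, const Shape& b) {
    return a.area() + b.area();
}
```

Calling `total_area(Square(2.0), Square(3.0))` exercises only the `area()` slot. Whether `Square::describe` survives in the final binary depends on the toolchain and its settings.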
Now, if nowhere in the executable there is a call of a particular virtual function (or, in other words, an access to the respective slot of the virtual function table), the corresponding function pointer could be omitted from the virtual function table, and the linker would remove the function's code, with possible further omissions of other code that becomes unreferenced.
Obviously, this can't be done by the compiler alone, since it only becomes clear at link time whether a particular virtual function is called (assuming static linking; it clearly can't be done with dynamic linking). I'm not familiar enough with linkers to tell whether the compiler can emit virtual function tables in such a way that the linker can selectively elide individual unused entries in the table.
Basically, my train of thought is this: A function pointer in a virtual function table is a reference to a function which the linker uses to determine that the function's code needs to be retained in the executable. In a similar way, a virtual function call is a reference to a particular slot in all virtual function tables that derive from the class whose virtual function is getting called. Could this kind of referencing be communicated to the linker in such a way that it can elide a virtual function table slot when there are zero references to it?
Note that this isn't the same as replacing a virtual function call with a direct call when the compiler can determine the call target at compile time. I know that some compilers can do that, but that's a different case because the function actually gets called, and it is the overhead of virtual function dispatch that is removed. In my case I want the entire code removed for functions that aren't called.
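For contrast, this is the devirtualization case I do not mean, as a rough sketch. `Codec`, `Doubler` and the two free functions are hypothetical names. In `encode_direct` the static type is a `final` class, so a compiler may replace the virtual dispatch with a direct call, but the function is still called and its code still has to exist.

```cpp
// Hypothetical types for illustration only.
struct Codec {
    virtual ~Codec() = default;
    virtual int encode(int x) const { return x; }
};

struct Doubler final : Codec {
    int encode(int x) const override { return 2 * x; }
};

// Static type is a final class: the compiler is allowed to emit a direct
// (or inlined) call here instead of going through the vtable.
int encode_direct(const Doubler& d, int x) { return d.encode(x); }

// Dynamic type is unknown here: in general this stays an indirect call
// through the vtable slot for encode().
int encode_virtual(const Codec& c, int x) { return c.encode(x); }
```

Both functions return the same result for a `Doubler`; only the dispatch mechanism may differ.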
If I had control over all class definitions, I could manually eliminate all virtual functions which aren't called. But that is unrealistic when using libraries.
Is this something that can be done with "link time optimization" or "whole program optimization"? Are there compilers which successfully do that?
The problem with the dead code is that the compiler cannot possibly be sure that the code is dead from the perspective of dynamic libraries. An executable can dynamically include a library that uses the dead code (derives from classes owning the dead code).
In addition to that, changing the structure of the v-table during link-time might work perfectly fine if the executable is the only one making function calls. However, if a dynamic library makes any calls, it will have a different understanding of the v-table and it will call the wrong function.
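The slot mismatch can be sketched with a hand-rolled table of function pointers. This is only a stand-in for a real vtable, whose layout is ABI-specific, and all names here are made up for the example. The caller was built against the three-slot layout and still assumes `second()` lives at index 1.

```cpp
#include <array>

// Hand-rolled stand-in for a compiler-generated vtable (illustration only).
using Slot = int (*)();

int first()  { return 1; }
int second() { return 2; }
int third()  { return 3; }

// Layout that separately compiled code (e.g. a dynamic library) was built against.
constexpr std::array<Slot, 3> full_table{first, second, third};

// Layout after a hypothetical link-time pass removed the "unused" second slot.
constexpr std::array<Slot, 2> pruned_table{first, third};

// The library side still believes second() lives at index 1.
constexpr int kSecondSlot = 1;

int call_second(const Slot* table) {
    return table[kSecondSlot]();
}
```

With `full_table.data()` the call reaches `second()`. With `pruned_table.data()` the same index silently lands on `third()`, which is the "wrong function" problem described above.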
Because of these facts, and because at face value little (if any) performance would be gained, optimising linkers are very unlikely to have this feature.
De-virtualisation of virtual functions is related to this, and safe optimising linkers can only de-virtualise a very small number of function calls. For instance, a linker can only de-virtualise a call if it can guarantee that no dynamic library can play any part in the call stack.
Edit: @curiousguy has brought up a case where the toolchain can be a bit more liberal with optimising, namely when it can know that no external code knows about the class. An example of this is a class with file scope.
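A class with file scope might look like the following sketch, assuming an unnamed namespace is what is meant. `Logger`, `Verbose` and `current_level` are hypothetical names. Because the types have internal linkage, no other translation unit or dynamic library can derive from them or call through their vtables, so the whole class hierarchy is visible in one place.

```cpp
namespace {

// Internal linkage: code outside this translation unit cannot name these types.
struct Logger {
    virtual ~Logger() = default;
    virtual int level() const { return 0; }
    // Never called in this file, and no other file can call it either.
    virtual const char* name() const { return "base"; }
};

struct Verbose final : Logger {
    int level() const override { return 2; }
    const char* name() const override { return "verbose"; }
};

}  // namespace

// The only entry point visible to the rest of the program.
int current_level() {
    Verbose v;
    const Logger& l = v;
    return l.level();  // only the level() slot is ever used
}
```

Whether a given compiler actually drops `name()` here is not something this sketch demonstrates; it only shows the situation in which the compiler has enough information to do so.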