
NVI and devirtualisation

If you're using NVI (the non-virtual interface idiom), can the compiler devirtualise function calls?

An example:

#include <iostream>

class widget
{
public:
    void foo() { bar(); }

private:
    virtual void bar() = 0;
};

class gadget final : public widget
{
private:
    void bar() override { std::cout << "gadget\n"; }
};

int main()
{
    gadget g;
    g.foo();    // HERE.
}

At the line marked can the compiler devirtualise the call to bar?

Simple asked Feb 16 '23


2 Answers

Given that the dynamic type of g is known to be exactly gadget, a compiler can devirtualize the call to bar after inlining foo, regardless of whether final appears on the class gadget declaration or on the declaration of gadget::bar. I'll analyse a similar program that doesn't use iostreams, since its output assembly is easier to read:

class widget
{
public:
    void foo() { bar(); }

private:
    virtual void bar() = 0;
};

class gadget : public widget
{
    void bar() override { ++counter; }
public:
    int counter = 0;
};

int test1()
{
    gadget g;
    g.foo();
    return g.counter;
}

int test2()
{
    gadget g;
    g.foo();
    g.foo();
    return g.counter;
}

int test3()
{
    gadget g;
    g.foo();
    g.foo();
    g.foo();
    return g.counter;
}

int test4()
{
    gadget g;
    g.foo();
    g.foo();
    g.foo();
    g.foo();
    return g.counter;
}

int testloop(int n)
{
    gadget g;
    while(--n >= 0)
        g.foo();
    return g.counter;
}

We can determine the success of devirtualization by examining the output assembly: (GCC), (clang). Both compilers optimize test1 into the equivalent of return 1; - the call is devirtualized and inlined, and the object eliminated. Clang does the same for test2 through test4 - return 2; / 3 / 4 respectively - but GCC seems to gradually lose track of the type information the more times it must perform the optimization. Despite successfully optimizing test1 to the return of a constant, GCC compiles test2 into roughly:

int test2() {
    gadget g;
    g.counter = 1;
    g.gadget::bar();
    return g.counter;
}

The first call has been devirtualized and its effect inlined (g.counter = 1), but the second has been only devirtualized. Adding the additional call in test3 results in:

int test3() {
    gadget g;
    g.counter = 1;
    g.gadget::bar();
    g.bar();
    return g.counter;
}

Again the first call is completely inlined, the second only devirtualized, but the third call isn't optimized at all. It's a plain Jane load from the virtual table and indirect function call. The result is the same for the additional call in test4:

int test4() {
    gadget g;
    g.counter = 1;
    g.gadget::bar();
    g.bar();
    g.bar();
    return g.counter;
}

Notably, neither compiler devirtualizes the call in the simple loop of testloop, which they both compile to the equivalent of:

int testloop(int n) {
  gadget g;
  while(--n >= 0)
    g.bar();
  return g.counter;
}

even reloading the vtable pointer from the object on each iteration.

Adding the final marker to both the class gadget declaration and the gadget::bar definition does not affect the assembly output generated by either compiler (GCC) (clang).

What does affect the generated assembly is removal of the NVI. This program:

class widget
{
public:
    virtual void bar() = 0;
};

class gadget : public widget
{
public:
    void bar() override { ++counter; }
    int counter = 0;
};

int test1()
{
    gadget g;
    g.bar();
    return g.counter;
}

int test2()
{
    gadget g;
    g.bar();
    g.bar();
    return g.counter;
}

int test3()
{
    gadget g;
    g.bar();
    g.bar();
    g.bar();
    return g.counter;
}

int test4()
{
    gadget g;
    g.bar();
    g.bar();
    g.bar();
    g.bar();
    return g.counter;
}

int testloop(int n)
{
    gadget g;
    while(--n >= 0)
        g.bar();
    return g.counter;
}

is completely optimized by both compilers (GCC) (clang) into the equivalent of:

int test1()
{ return 1; }

int test2()
{ return 2; }

int test3()
{ return 3; }

int test4()
{ return 4; }

int testloop(int n)
{ return n >= 0 ? n : 0; }

To conclude: although compilers can devirtualize the calls to bar, they don't always do so in the presence of NVI. Application of the optimization is imperfect in current compilers.

Casey answered Feb 19 '23


In theory yes - but that has nothing to do with NVI. In your example, the compiler could in principle devirtualize a direct call to g.bar() as well. The only thing the compiler needs to know is whether the object is really of type gadget or might be something else. If the compiler can deduce that it can only be a gadget, it can devirtualize the call.

In practice, though, many compilers won't try.

Tobias Langner answered Feb 19 '23