Is virtual table creation thread safe?

Let me begin by saying that I know it is bad practice to call virtual functions from within a constructor or destructor. However, the behavior of doing so, although it might be confusing or not what the user expects, is still well defined.

#include <iostream>

struct Base
{
    Base()
    {
        Foo(); // resolves to Base::Foo, even while a Derived is being constructed
    }
    virtual ~Base() = default;
    virtual void Foo() const
    {
        std::cout << "Base" << std::endl;
    }
};

struct Derived : public Base
{
    void Foo() const override
    {
        std::cout << "Derived" << std::endl;
    }
};

int main()
{
    Base base;
    Derived derived;
    return 0;
}

Output:
Base
Base

Now, back to my real question. What happens if a virtual function is called on the object from a different thread while the constructor is still running? Is there a race condition? Is it undefined? Or, to put it in other words: is the compiler's setting of the vtable pointer thread-safe?

Example:

#include <future>
#include <iostream>

struct Base
{
    Base() :
        // The task may run on another thread while construction is still in progress.
        future_(std::async(std::launch::async, [this] { Foo(); }))
    {
    }
    virtual ~Base() = default;

    virtual void Foo() const
    {
        std::cout << "Base" << std::endl;
    }

    std::future<void> future_;
};

struct Derived : public Base
{
    void Foo() const override
    {
        std::cout << "Derived" << std::endl;
    }
};

int main()
{
    Base base;
    Derived derived;
    return 0;
}

Output:
?
Gils asked Jun 09 '20 17:06



2 Answers

First off, a few excerpts from the standard that are relevant in this context:

[defns.dynamic.type]

type of the most derived object to which the glvalue refers [Example: If a pointer p whose static type is "pointer to class B" is pointing to an object of class D, derived from B, the dynamic type of the expression *p is "D". References are treated similarly. — end example]

[intro.object] 6.7.2.1

[..] An object has a type. Some objects are polymorphic; the implementation generates information associated with each such object that makes it possible to determine that object's type during program execution.

[class.cdtor] 11.10.4.4

Member functions, including virtual functions, can be called during construction or destruction. When a virtual function is called directly or indirectly from a constructor or from a destructor, including during the construction or destruction of the class's non-static data members, and the object to which the call applies is the object (call it x) under construction or destruction, the function called is the final overrider in the constructor's or destructor's class and not one overriding it in a more-derived class. [..]

As you wrote, it is clearly defined how virtual function calls in the constructor/destructor work: they depend on the dynamic type information associated with the object, and that information changes in the course of the execution. It is not relevant what kind of pointer you are using to "look at the object". Consider this example:

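#include <iostream>
#include <typeinfo>

struct Base {
  Base() {
    print_type(this);
  }

  virtual ~Base() = default;

  // Always receives a Base*, yet typeid reports the type currently under construction.
  static void print_type(Base* obj) {
      std::cout << "obj has type: " << typeid(*obj).name() << std::endl;
  }
};

struct Derived : public Base {
  Derived() {
    print_type(this);
  }
};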

print_type always receives a pointer to Base, but when you create an instance of Derived you will see two lines: one naming Base and one naming Derived (the exact strings returned by name() are implementation-defined). The dynamic type information is set at the very beginning of the constructor, so you can call a virtual function as part of the member initialization.
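Because the strings returned by name() vary between compilers, the same observation can be checked by comparing type_info objects directly. The following is only a sketch: the observed log, type_label, construct_derived and demo are hypothetical helpers introduced for illustration, not part of the original example.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>
#include <vector>

struct Base {
  Base();
  virtual ~Base() = default;
};

struct Derived : Base {
  Derived();
};

// Hypothetical log of the types observed while constructors run.
std::vector<std::string> observed;

// Hypothetical helper: always sees a Base reference, but typeid is evaluated
// at run time because Base is polymorphic.
std::string type_label(const Base& obj) {
  if (typeid(obj) == typeid(Derived)) return "Derived";
  if (typeid(obj) == typeid(Base)) return "Base";
  return "other";
}

Base::Base() { observed.push_back(type_label(*this)); }
Derived::Derived() { observed.push_back(type_label(*this)); }

// Constructs one Derived and returns what each constructor observed.
std::vector<std::string> construct_derived() {
  observed.clear();
  Derived d;
  return observed;
}

void demo() {
  const std::vector<std::string> log = construct_derived();
  assert(log.size() == 2);
  assert(log[0] == "Base");     // seen from Base's constructor
  assert(log[1] == "Derived");  // seen from Derived's constructor
}
```

Calling demo() from main should pass both assertions, since typeid applied to the object under construction yields the constructor's own class.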

It is not specified how or where this information is stored, but it is associated with the object itself.

[..] the implementation generates information associated with each such object [..]

In order to change the dynamic type, this information has to be updated. This may be some data that is introduced by the compiler, but operations on that data are still covered by the memory model:

[intro.memory] 6.7.1.3

A memory location is either an object of scalar type or a maximal sequence of adjacent bit-fields all having nonzero width. [ Note: Various features of the language, such as references and virtual functions, might involve additional memory locations that are not accessible to programs but are managed by the implementation. — end note]

So the information associated with the object is stored and updated in some memory location. But that is where data races happen:

[intro.races]

[..]
Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location.
[..]
The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other [..]

The update of the dynamic type is not atomic, and since there is no other synchronization that would enforce a happens-before order, this is a data race and therefore UB.

Even if the update were atomic, you would still have no guarantee about the state of the object as long as the constructor has not finished, so there is no point in making it atomic.
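One way to get the missing happens-before order is to start the task only after construction has finished. This is a minimal sketch under that assumption, not something the question or the standard prescribes: Start, Result, MakeStarted and demo are hypothetical names, and Foo returns a string here (instead of printing) so the result can be checked.

```cpp
#include <cassert>
#include <future>
#include <memory>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string Foo() const { return "Base"; }

    // Hypothetical second phase: call only on a fully constructed object.
    // The call to std::async synchronizes with the start of the task, so the
    // task observes the completed construction.
    void Start() {
        future_ = std::async(std::launch::async, [this] { return Foo(); });
    }

    // Waits for the task. Call this before destroying the object; otherwise
    // the task could still be running while destruction is in progress.
    std::string Result() { return future_.get(); }

private:
    std::future<std::string> future_;
};

struct Derived : Base {
    std::string Foo() const override { return "Derived"; }
};

// Hypothetical factory: construct T completely, then start the task.
template <typename T>
std::unique_ptr<T> MakeStarted() {
    auto obj = std::make_unique<T>();
    obj->Start();
    return obj;
}

void demo() {
    auto b = MakeStarted<Base>();
    auto d = MakeStarted<Derived>();
    assert(b->Result() == "Base");
    assert(d->Result() == "Derived");  // dispatches to the most derived class
}
```

With this ordering the concurrent call no longer overlaps with the constructor, so it resolves to the final overrider of the complete object. On many toolchains the program has to be built with thread support enabled (for example -pthread).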


Update

Conceptually it feels like the object takes on different types during construction and destruction. However, it has been pointed out to me by @LanguageLawyer that the dynamic type of an object (more precisely of a glvalue that refers to that object) corresponds to the most derived type, and this type is clearly defined and does not change. [class.cdtor] also includes a hint about this detail:

[..] the function called is the final overrider in the constructor's or destructor's class and not one overriding it in a more-derived class.

So even though the behavior of virtual function calls and the typeid operator is defined as if the object takes on different types, that is actually not the case.

That said, in order to achieve the specified behavior, something in the state of the object (or at least some information associated with that object) has to be changed. And as pointed out in [intro.memory], these additional memory locations are indeed subject to the memory model. So I still stand by my initial assessment that this is a data race.
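To make "something in the state of the object has to be changed" more concrete, here is a hand-written model of one common implementation strategy, a per-object pointer to a per-class function table. It is only an illustration under that assumption: the standard does not require this layout, and VTable, ModelBase, ModelDerived and demo are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for compiler-generated dispatch data.
struct VTable {
    const char* (*foo)();
};

const char* base_foo() { return "Base"; }
const char* derived_foo() { return "Derived"; }

const VTable base_vtable{&base_foo};
const VTable derived_vtable{&derived_foo};

struct ModelBase {
    const VTable* vptr;        // plain pointer: stores to it are not atomic
    std::string seen_in_base;

    // While the base part is constructed, dispatch goes through base_vtable.
    ModelBase() : vptr(&base_vtable), seen_in_base(vptr->foo()) {}
};

struct ModelDerived : ModelBase {
    std::string seen_in_derived;

    ModelDerived() {
        // Ordinary store. A read of vptr from another thread that is not
        // ordered with this store would be a data race.
        vptr = &derived_vtable;
        seen_in_derived = vptr->foo();
    }
};

void demo() {
    ModelDerived d;
    assert(d.seen_in_base == "Base");
    assert(d.seen_in_derived == "Derived");
}
```

In this model the update in ModelDerived's constructor is exactly the kind of non-atomic modification of a memory location that [intro.races] talks about.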

mpoeter answered Sep 28 '22 13:09


I believe [class.base.init]/16:

Member functions (including virtual member functions) can be called for an object under construction. Similarly, an object under construction can be the operand of the typeid operator or of a dynamic_cast. However, if these operations are performed in a ctor-initializer (or in a function called directly or indirectly from a ctor-initializer) before all the mem-initializers for base classes have completed, the program has undefined behavior.

should answer the question. However, it is defective. The fix would be to replace "before" with "not after":

However, if these operations are performed in a ctor-initializer (or in a function called directly or indirectly from a ctor-initializer) not after all the mem-initializers for base classes have completed, the program has undefined behavior.

Currently, the paragraph says that the behavior is undefined only if the invocation of a member function happens before the mem-initializers for base classes have completed. It does not cover your case, in which neither the invocation happens before the completion of base class initialization, nor the completion of base class initialization happens before the invocation.

Language Lawyer answered Sep 28 '22 11:09