
Is it undefined behavior to run a member function in a separate thread, in parallel to the type's constructor?

This is something you should never do in practice, but https://timsong-cpp.github.io/cppwp/class.cdtor#4 states:

Member functions, including virtual functions ([class.virtual]), can be called during construction or destruction ([class.base.init]).

Does this hold if the functions are called in parallel? That is, ignoring the race condition, if the A is in the middle of construction, and frobme is called at some point AFTER the constructor is invoked (e.g. during construction), is that still defined behavior?

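#include <new>
#include <thread>

struct A {
    void frobme() {}
};

int main() {
    alignas(A) char mem[sizeof(A)];

    // Capture by reference so both threads use the same storage;
    // capturing the array by value would give each lambda its own copy.
    // The unsynchronized race between t1 and t2 is intentional: it is the subject of the question.
    auto t1 = std::thread([&mem] { new (mem) A; });
    auto t2 = std::thread([&mem] { reinterpret_cast<A*>(mem)->frobme(); });

    t1.join();
    t2.join();
}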

As a separate scenario, it was also pointed out to me that A's constructor could itself create multiple threads, and those threads may invoke a member function before A has finished construction. In that case the ordering of operations would be easier to analyze, because you know no races can occur until AFTER the thread is created in the constructor.

Mike Lui asked Mar 16 '20 20:03


2 Answers

There are two issues here: your specific code and your general question.

In your specific code, even in the best-case scenario (where t2 executes after t1), you have a data race due to the lack of synchronization between creation and use. That makes your code UB regardless of the order of execution.

In the general question, let's assume that the constructor of a type hands the this pointer off to some other thread, which then calls functions on it, and the hand-off itself is properly synchronized. Would some other thread invoking a member function be considered a data race?

Well, it certainly would be a data race if the other thread invokes a function that reads member values or other data written by the constructor after the point of the hand-off, or if the constructor accesses members or other data written by the member function being invoked. That is, it is a data race whenever the two pieces of concurrently executing code make conflicting, unsynchronized accesses to the same memory location.

Assuming that neither of those is the case, everything should be fine, mostly. (It's possible to define A in such a way that your reinterpret_cast doesn't return a usable pointer to the A you created in that storage; you'd need to launder it with std::launder.) An object under construction or destruction can be accessed, but only in certain ways. Stick to those ways and you should be fine, with one possible catch.

There's nothing in the standard about data races on the completion of an object's initialization, only on conflicts in memory locations. Once construction moves past the class whose constructor handed out the pointer, the behavior of virtual functions can change (through updated vtable pointers and the like) if the dynamic type is a class derived from the class given to the other thread. I don't believe there's a clear statement about this in the section on the object model.

Also, note that C++20 added a special rule to class.cdtor:

During the construction of an object, if the value of the object or any of its subobjects is accessed through a glvalue that is not obtained, directly or indirectly, from the constructor's this pointer, the value of the object or subobject thus obtained is unspecified.

Nicol Bolas answered Oct 12 '22 22:10


Besides the race condition (which you might be managing with mutexes or similar), you're subject to the usual limitations on an object whose lifetime has not yet started, namely:

Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that represents the address of the storage location where the object will be or was located may be used but only in limited ways.

See [basic.life] for the full list of operations that are and are not allowed.

In particular, one of the restrictions is that

The program has undefined behavior if:

...

  • the glvalue is used to call a non-static member function of the object

which clearly forbids your example.

Also [class.cdtor] says:

For an object with a non-trivial constructor, referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior

and even if you do synchronize on some event triggered after construction begins, this rule still applies to that code:

During the construction of an object, if the value of the object or any of its subobjects is accessed through a glvalue that is not obtained, directly or indirectly, from the constructor's this pointer, the value of the object or subobject thus obtained is unspecified

Ben Voigt answered Oct 12 '22 23:10