
Why do move constructors and move assignment operators of the Standard Library leave the moved-from object in an unspecified state?

There is a special description for move constructors and move assignment operators in the C++ Standard Library that says that the object the data is moved from is left in a valid but unspecified state after the call. Why? I frankly don't understand it. It is something I intuitively don't expect. Really, if I move something from one place to another in the real world, the place I move it from is left empty (and yep, valid) until I move something new there. Why should it be different in the C++ world?

For example, depending on the implementation the following code:

std::vector<int> a {1, 2, 3};
std::vector<int> b {4, 5, 6};
a = std::move(b);

may be equivalent to the following code:

std::vector<int> a {1, 2, 3};
std::vector<int> b {4, 5, 6};
a.swap(b);

That is really not what I expect. If I move the data from one vector to another, I expect the vector I move the data from to be empty (zero size).

As far as I know, the GCC implementation of the C++ Standard Library leaves the vector in an empty state after the move. Why not make this behavior part of the Standard?

What are the reasons to leave an object in an unspecified state? If it is for optimization, that is kind of strange too. The only reasonable thing I can do with an object in an unspecified state is to clear it (OK, I can get the size of the vector and I can print its contents, but since the contents are unspecified I don't need them). So the object will be cleared either way: by me manually, or by a call to the assignment operator or the destructor. I prefer to clear it myself, because I expect it to be cleared. But that's a double call to clear. Where is the optimization?

asked Feb 26 '18 14:02 by anton_rh



2 Answers

There is a special description for move constructors and move assignment operators in the C++ Standard Library that says that the object the data is moved from is left in a valid but unspecified state after the call. Why? I frankly don't understand it. It is something I intuitively don't expect. Really, if I move something from one place to another in the real world, the place I move it from is left empty (and yep, valid) until I move something new there. Why should it be different in the C++ world?

It isn't.

But you're failing to consider that a "move" cannot always be a move. What happens when you move data from a std::array, for example? Not much. Since an array stores its data in-place, there are no pointers to swap, and the array can only move its elements one by one; for elements such as int, that move is just a copy. As such, although the library could destroy the original data, there's not really any point in doing so, and so the standard won't go any further than saying "we don't guarantee what you get".
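To make the std::array point concrete, here is a minimal sketch. The function names are made up for illustration. It only shows that moving an array of ints leaves the source untouched, and that an array of strings still holds the same number of (valid but unspecified) strings afterwards.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Moving a std::array moves each element in turn; there is no buffer to hand over.
// Returns true if the source still compares equal to the destination afterwards.
bool int_array_source_unchanged() {
    std::array<int, 3> src{1, 2, 3};
    std::array<int, 3> dst = std::move(src);  // element-wise move; moving an int copies it
    return src == dst;                        // the ints in src were not modified
}

// With elements that own resources, each element is moved individually.
// The source array still holds 2 strings, each in a valid but unspecified state.
std::size_t string_array_size_after_move() {
    std::array<std::string, 2> src{std::string(100, 'x'), std::string(100, 'y')};
    std::array<std::string, 2> dst = std::move(src);
    assert(dst[0].size() == 100);  // the destination received the contents
    return src.size();             // always 2: a std::array cannot become "empty"
}
```

Nothing here relies on what the moved-from strings contain, only on the fact that the array still has its fixed number of elements.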

A real example is a std::string which is currently storing its contents not in a dynamically-allocated block of memory, but in a small automatically allocated block of memory (this is commonly referred to as the small string optimisation). Like an array, there is no way to actually "move" this information; it must be copied. The string could zero it out afterwards, and it could reduce its length counter to zero, but why force that runtime cost on its users?
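As a rough sketch of what "valid but unspecified" allows in practice (reuse_after_move is a hypothetical helper, not a library function): you may call anything without preconditions on the moved-from string, such as clear() or assignment, but you should not assume what it currently holds, whether it was short enough for the small string optimisation or not.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper: moves out of a string, then puts the source back
// into a known state before using it again.
std::string reuse_after_move(std::string src) {
    std::string dst = std::move(src);
    // src is now valid but unspecified: it may be empty, or it may still
    // hold its old characters. Either way, clear() has no preconditions.
    src.clear();
    assert(src.empty());  // known state from here on
    src += "reused";
    (void)dst;            // dst holds the original value
    return src;
}
```

The same function works for a short string and for a long, heap-allocated one, because it never reads the unspecified contents.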

So, it would be possible to make stronger guarantees about the state of a post-moved container, on a case-by-case basis, but only by artificially constraining implementations (and reducing optimisation opportunities) for frankly no good reason.

Real-world analogies can be fun as a thought experiment, but using them to actually reason about the behaviour of a programming language is folly.

answered Sep 26 '22 17:09 by Lightness Races in Orbit


What are the reasons to leave an object in an unspecified state?

Each class can have a different state that is reasonable to leave behind. Here "unspecified" means "decided case by case". The state can simply be the old value of the target (so the implementation can perform just a cheap swap) if that has no side effects, but in the case of vectors or shared_ptrs the state must be empty (see the definitions of their move constructors).

https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#c64-a-move-operation-should-move-and-leave-its-source-in-a-valid-state
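A small sketch of the cases where the library does pin the state down, as far as I understand the specification: the source of a vector move construction is left empty, and the source of a shared_ptr move is left null. The function names are invented for this example.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Move construction of a vector takes over the buffer, so the source ends up empty.
bool vector_source_empty_after_move_construction() {
    std::vector<int> src{4, 5, 6};
    std::vector<int> dst(std::move(src));
    assert(dst.size() == 3);  // dst now owns the three elements
    return src.empty();
}

// Moving a shared_ptr transfers ownership and leaves the source null.
bool shared_ptr_source_null_after_move() {
    auto src = std::make_shared<int>(42);
    std::shared_ptr<int> dst = std::move(src);
    return src == nullptr && dst.use_count() == 1 && *dst == 42;
}
```

Move assignment of a vector is deliberately left out: that is the case from the question, where the state of the source is not specified in general.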

When you applied this in your case, memory corruption arose. The reason is explained in the following.

The OP reported his code in a comment at the following links: "wrong example: coliru.stacked-crooked.com/a/8698a44f63084d68, fixed example: coliru.stacked-crooked.com/a/b6e680c8f24b8123, UB version that works fine on gcc: coliru.stacked-crooked.com/a/44f9ab54257e25ec"

The real problem you are facing is that you must never mix shared_ptrs and raw pointers. In fact, you are declaring

auto p_processor = std::make_shared<BackGroundProcessor>();

and then copying the shared pointer into the function object passed to Run():

Event ev_done;
p_processor->Run([p_processor, &ev_done]() { ev_done.Set(); });

and then launching the thread by capturing this, that is, you are using the object through a raw pointer:

void Run(std::function<void()> on_done)
{
    m_on_done.swap(on_done);
    std::thread([this]()
    {
        // Doing some processing
        ...
        m_on_done = nullptr;

Since the thread takes longer than main(), when you reset the shared_ptr in main() its reference count drops to 1. Then, in the thread, as soon as m_on_done is reset, the object running in the thread (that is, this itself) gets deleted before the thread terminates. I believe this is the origin of all the non-reproducible behavior you have encountered.

One common approach to deal with this is to use shared_from_this(), declaring:

class BackGroundProcessor : public std::enable_shared_from_this<BackGroundProcessor>
{
   ...

(find the full fix here http://coliru.stacked-crooked.com/a/1f5c425696c29011)

Then create a shared_ptr and copy it into the thread lambda, so that it keeps the object alive while the thread is running:

void Run(std::function<void()> on_done)
{
    auto self = shared_from_this();
    m_on_done.swap(on_done);
    std::thread([this,self]()
    {
        // Doing some processing
        ...

Specifying it in the capture list should be enough.
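For completeness, here is a self-contained sketch of the same idea. Worker and run_and_release_early are hypothetical stand-ins, not the OP's BackGroundProcessor, and returning the std::thread so the caller can join it is my own simplification for the example.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical stand-in for the class discussed above, not the OP's actual code.
class Worker : public std::enable_shared_from_this<Worker> {
public:
    // Must be called on an object that is owned by a std::shared_ptr.
    std::thread Run(std::function<void()> on_done) {
        assert(!m_on_done && "Run() called while a callback is still pending");
        auto self = shared_from_this();  // extra owner that travels with the thread
        m_on_done = std::move(on_done);
        return std::thread([self]() {
            // ... some processing ...
            if (self->m_on_done) self->m_on_done();
            self->m_on_done = nullptr;   // safe: self still owns the object here
        });
    }

private:
    std::function<void()> m_on_done;
};

// Returns true if the callback ran even though the caller dropped its reference early.
bool run_and_release_early() {
    std::atomic<bool> done{false};
    auto worker = std::make_shared<Worker>();
    std::thread t = worker->Run([&done]() { done = true; });
    worker.reset();  // the copy captured in the lambda keeps the Worker alive
    t.join();
    return done.load();
}
```

Capturing only self (instead of this plus self) is a matter of taste; the point is that the thread holds its own shared_ptr, so resetting m_on_done can no longer destroy the object while it is still in use.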

answered Sep 25 '22 17:09 by Sigi