Why would the behavior of std::memcpy be undefined for objects that are not TriviallyCopyable?

From http://en.cppreference.com/w/cpp/string/byte/memcpy:

If the objects are not TriviallyCopyable (e.g. scalars, arrays, C-compatible structs), the behavior is undefined.

At my work, we have used std::memcpy for a long time to bitwise swap objects that are not TriviallyCopyable using:

    void swapMemory(Entity* ePtr1, Entity* ePtr2)
    {
        static const int size = sizeof(Entity);
        char swapBuffer[size];

        memcpy(swapBuffer, ePtr1, size);
        memcpy(ePtr1, ePtr2, size);
        memcpy(ePtr2, swapBuffer, size);
    }

and never had any issues.

I understand that it is trivial to abuse std::memcpy with non-TriviallyCopyable objects and cause undefined behavior downstream. However, my question:

Why would the behavior of std::memcpy itself be undefined when used with non-TriviallyCopyable objects? Why does the standard deem it necessary to specify that?

UPDATE

The contents of http://en.cppreference.com/w/cpp/string/byte/memcpy have been modified in response to this post and the answers to the post. The current description says:

If the objects are not TriviallyCopyable (e.g. scalars, arrays, C-compatible structs), the behavior is undefined unless the program does not depend on the effects of the destructor of the target object (which is not run by memcpy) and the lifetime of the target object (which is ended, but not started by memcpy) is started by some other means, such as placement-new.

PS

Comment by @Cubbi:

@RSahu if something guarantees UB downstream, it renders the entire program undefined. But I agree that it appears to be possible to skirt around UB in this case and modified cppreference accordingly.

asked Apr 21 '15 16:04 by R Sahu


2 Answers

Why would the behavior of std::memcpy itself be undefined when used with non-TriviallyCopyable objects?

It's not! However, once you copy the underlying bytes of one object of a non-trivially copyable type into another object of that type, the target object is not alive. We destroyed it by reusing its storage, and haven't revitalized it by a constructor call.

Using the target object - calling its member functions, accessing its data members - is clearly undefined ([basic.life]/6), and so is a subsequent implicit destructor call ([basic.life]/4) for target objects having automatic storage duration. Note that undefined behavior is retroactive. [intro.execution]/5:

However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

If an implementation spots that an object is dead and necessarily subject to further operations that are undefined, ... it may react by altering your program's semantics, from the memcpy call onward. This consideration becomes very practical once we think of optimizers and the assumptions they make.

It should be noted that standard libraries are able and allowed to optimize certain standard library algorithms for trivially copyable types, though. std::copy on pointers to trivially copyable types usually calls memcpy on the underlying bytes. So does swap.
So simply stick to the normal generic algorithms and let the compiler do any appropriate low-level optimizations. This is partly what the idea of a trivially copyable type was invented for in the first place: determining the legality of certain optimizations. It also spares you from having to worry about contradictory and underspecified parts of the language.
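As a rough illustration of that advice, here is a minimal sketch of a swap helper that copies raw bytes only when the type is trivially copyable and otherwise defers to the ordinary swap. The names swap_objects, Point and demo_swap_objects are hypothetical and not part of the question's code; in practice plain std::swap already leaves this decision to the library and the compiler.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helper: byte-wise swap, selected only for trivially copyable T.
template <typename T>
typename std::enable_if<std::is_trivially_copyable<T>::value>::type
swap_objects(T& a, T& b) {
    if (&a == &b) return;                 // memcpy must not see overlapping ranges
    unsigned char buffer[sizeof(T)];
    std::memcpy(buffer, &a, sizeof(T));
    std::memcpy(&a, &b, sizeof(T));
    std::memcpy(&b, buffer, sizeof(T));
}

// Fallback for every other type: use the regular swap, which goes through
// the type's own copy/move operations (or its custom swap found via ADL).
template <typename T>
typename std::enable_if<!std::is_trivially_copyable<T>::value>::type
swap_objects(T& a, T& b) {
    using std::swap;
    swap(a, b);
}

// Illustrative trivially copyable type (an assumption for this sketch).
struct Point {
    int x;
    int y;
};

bool demo_swap_objects() {
    Point p = {1, 2};
    Point q = {3, 4};
    swap_objects(p, q);                   // takes the memcpy path

    std::string s = "left";
    std::string t = "right";
    swap_objects(s, t);                   // takes the std::swap path

    assert(p.x == 3 && p.y == 4 && q.x == 1 && q.y == 2);
    assert(s == "right" && t == "left");
    return true;
}
```

The two overloads are mutually exclusive, so a type such as std::string can never reach the memcpy path by accident.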

answered Oct 26 '22 03:10 by Columbo


It is easy enough to construct a class where that memcpy-based swap breaks:

    struct X {
        int x;
        int* px; // invariant: always points to x

        X() : x(), px(&x) {}
        X(X const& b) : x(b.x), px(&x) {}
        X& operator=(X const& b) { x = b.x; return *this; }
    };

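memcpying such an object breaks that invariant: the copied px still holds the address of the source object's x rather than the address of the target's own x.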
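For contrast, the following sketch restates X in compilable form and checks that a swap going through its copy operations keeps the invariant intact. The helpers invariant_holds and demo_copy_preserves_invariant are hypothetical names added for illustration. The memcpy-based swap is deliberately not executed here, since running it on this type is exactly the undefined behavior under discussion.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct X {
    int x;
    int* px; // invariant: always points to x

    X() : x(), px(&x) {}
    X(X const& b) : x(b.x), px(&x) {}
    X& operator=(X const& b) { x = b.x; return *this; }
};

// The user-provided copy operations make X non-trivially-copyable,
// so the memcpy guarantees do not apply to it.
static_assert(!std::is_trivially_copyable<X>::value,
              "X must not be treated as raw bytes");

// Hypothetical helper: checks the invariant stated in the comment above.
bool invariant_holds(X const& v) { return v.px == &v.x; }

bool demo_copy_preserves_invariant() {
    X a;
    a.x = 1;
    X b;
    b.x = 2;

    // std::swap uses X's copy constructor and copy assignment, each of
    // which keeps px pointing at the object's own x.
    std::swap(a, b);

    assert(a.x == 2 && b.x == 1);
    assert(invariant_holds(a) && invariant_holds(b));
    return true;
}
```

A byte-wise swap would instead exchange the two px values, leaving each object pointing into the other.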

The GNU (libstdc++) C++11 std::string does exactly that for short strings: its data pointer points into a buffer stored inside the string object itself.

This is similar to how the standard file and string streams are implemented. The streams eventually derive from std::basic_ios, which contains a pointer to std::basic_streambuf. The streams also contain the specific buffer as a member (or base class sub-object), and that is what the pointer in std::basic_ios points to.

answered Oct 26 '22 02:10 by Maxim Egorushkin