
Copy trivially copyable types using temporary storage areas: is it allowed?

This question is a follow-up to a comment on an answer to another question.


Consider the following example:

#include <cstring>
#include <type_traits>
#include <cassert>

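int main() {
    // Two raw storage areas; sizeof(int) <= sizeof(void*) is assumed here, not checked.
    std::aligned_storage_t<sizeof(void*), alignof(void*)> storage, copy;

    int i = 42;
    std::memcpy(&storage, &i, sizeof(int));   // object -> first storage area

    copy = storage;                           // first storage area -> second storage area

    int j{};
    std::memcpy(&j, &copy, sizeof(int));      // second storage area -> distinct object

    assert(j == 42);
}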

This works (for some definitions of works). However, the standard tells us this:

For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes making up the object can be copied into an array of char, unsigned char, or std::byte.
If the content of that array is copied back into the object, the object shall subsequently hold its original value. [ Example:

#define N sizeof(T)
char buf[N];
T obj;                          // obj initialized to its original value
std::memcpy(buf, &obj, N);      // between these two calls to std::memcpy, obj might be modified
std::memcpy(&obj, buf, N);      // at this point, each subobject of obj of scalar type holds its original value

 — end example ]

And this:

For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1. [ Example:

T* t1p;
T* t2p;
    // provided that t2p points to an initialized object ...
std::memcpy(t1p, t2p, sizeof(T));
    // at this point, every subobject of trivially copyable type in *t1p contains
    // the same value as the corresponding subobject in *t2p

 — end example ]

In neither case does it mention that copying a trivially copyable object into a buffer and then copying it from there into a new instance of the original type is allowed.
In the example above I do something similar, and I also copy the buffer into a second buffer (which is closer to the real-world case).

In the comments linked at the top of the question, the author says that this behavior is underspecified. On the other hand, if this isn't allowed, I cannot see how I could, e.g., send an int over the network and use it on the other end (copy an int into a buffer, send it over the network, receive it as a buffer, and memcpy it into an instance of int; that is more or less what I do in the example, without a network in between).

Is this allowed by some other bullet of the standard that I missed, or is it really underspecified?

skypjack asked Feb 05 '19 09:02


2 Answers

It reads fine to me.

You've copied the underlying bytes of obj1 into obj2. Both are trivially copyable and of the same type. The prose you quote permits this explicitly.

The fact that said underlying bytes were temporarily stored in a correctly-sized and correctly-aligned holding area, via an also-explicitly-permitted reinterpretation as char*, doesn't seem to change that. They're still "those bytes". There's no rule that says copying must be "direct" in order to satisfy features like this.

Indeed, this is not only a completely common pattern when dealing with network transfer (conventional use of course doesn't make it right on its own), but also a historically normal thing to do that the standard would be mad not to account for (which gives me all the assurance I need that it is indeed intended).
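As a rough illustration of that network-style pattern, here is a minimal sketch. The names `IntBytes`, `encode_int`, `fake_transport` and `decode_int` are hypothetical, and `fake_transport` merely stands in for a real send/receive pair by copying the bytes into a separate buffer. It assumes both ends share the same representation of `int`, so byte order and size differences are ignored.

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Byte buffer exactly as large as an int (hypothetical alias for this sketch).
using IntBytes = std::array<unsigned char, sizeof(int)>;

// Sender side: copy the bytes of the int into a buffer.
IntBytes encode_int(int value) {
    IntBytes out{};
    std::memcpy(out.data(), &value, sizeof(int));
    return out;
}

// Stand-in for the network: the bytes end up in a different buffer.
IntBytes fake_transport(const IntBytes& sent) {
    IntBytes received{};
    std::memcpy(received.data(), sent.data(), sent.size());
    return received;
}

// Receiver side: copy the received bytes into a distinct int object.
int decode_int(const IntBytes& received) {
    int value = 0;
    std::memcpy(&value, received.data(), sizeof(int));
    return value;
}
```

Under those assumptions, `decode_int(fake_transport(encode_int(42)))` is expected to give back `42`; a real protocol would additionally have to fix the byte order and the integer width.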

I can see how there may be doubt, given that the rule is first given for copying those bytes back into the original object, then given again for copying those bytes into a new object. But I can't detect any logical difference between the two circumstances, and therefore find the first quoted wording to be largely redundant. It's possible the author just wanted to be crystal clear that this safety applies identically in both cases.

Lightness Races in Orbit answered Sep 20 '22 13:09


To me, this is one of the most ambiguous issues in C++. Honestly, nothing in C++ has confused me as much as type punning. There's always a corner case that seems not to be covered (or underspecified, as you put it).

However, conversion from integers to raw memory (char*) is supposed to be allowed for serialization/examination of the underlying object.
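For the examination part, a small sketch of what I mean. The helper name `to_hex_bytes` is made up for this example, and it assumes 8-bit bytes. It only reads the object representation through `unsigned char*`, which is allowed to alias any object.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: renders the bytes of `value` as lowercase hex,
// in memory order, two hex digits per byte (assumes CHAR_BIT == 8).
template <typename T>
std::string to_hex_bytes(const T& value) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t n = 0; n < sizeof(T); ++n) {
        out += digits[(p[n] >> 4) & 0x0F];  // high nibble
        out += digits[p[n] & 0x0F];         // low nibble
    }
    return out;
}
```

The result for a multi-byte type such as `int` depends on the platform's byte order, which is exactly the kind of thing this lets you inspect.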

What's the solution?

Unit tests. That's my solution to the problem. You do what complies most closely with the standard, and you write basic unit tests that check your particular assumptions. Then, whenever you compile a new version or move to a new compiler, you run the unit tests and verify that the compiler does what you expect it to do.
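A sketch of what such a test could look like, without any particular test framework. `Sample`, `survives_round_trip` and `all_round_trip_checks_pass` are hypothetical names, and the check only covers the assumption from the question (object to buffer, buffer to buffer, buffer to a new object).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical trivially copyable aggregate used as a test subject.
struct Sample {
    std::uint32_t id;
    double reading;
    char tag;
};

// Member-wise comparison, so padding bytes are never inspected.
inline bool operator==(const Sample& a, const Sample& b) {
    return a.id == b.id && a.reading == b.reading && a.tag == b.tag;
}

// Returns true if `original` keeps its value after going
// object -> buffer -> second buffer -> distinct object.
template <typename T>
bool survives_round_trip(const T& original) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    unsigned char staging[sizeof(T)];
    unsigned char received[sizeof(T)];
    std::memcpy(staging, &original, sizeof(T));
    std::memcpy(received, staging, sizeof(T));
    T restored;
    std::memcpy(&restored, received, sizeof(T));
    return restored == original;
}

// Runs the check for a few representative types.
bool all_round_trip_checks_pass() {
    return survives_round_trip(42)
        && survives_round_trip(2.5)
        && survives_round_trip(Sample{7u, 2.5, 'x'});
}
```

You would call `all_round_trip_checks_pass()` from whatever test runner you use and treat a `false` result as a signal that the compiler or platform breaks your assumption.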

The Quantum Physicist answered Sep 21 '22 13:09