Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the no-undefined-behavior way of deserializing an object from a byte array in C++11 (or later)?

To overcome alignment issues, I need to memcpy into a temporary. What type should that temporary be? gcc complains that the following reinterpret_cast will break strict aliasing rules:

template <typename T>
T deserialize(char *ptr) {
    static_assert(std::is_trivially_copyable<T>::value, "must be trivially copyable");
    alignas(T) char raw[sizeof(T)];
    memcpy(raw, ptr, sizeof(T));
    return *reinterpret_cast<T *>(raw);
}

(e.g. when T is "long").

I don't want to define a T, since I don't want to construct a T before overwriting it.

In a union, doesn't writing one member then reading another count as undefined behavior?

template<typename T>
T deserialize(char *ptr) {
    union {
        char arr[sizeof(T)];
        T obj;
    } u;

    memcpy(u.arr, ptr, sizeof(T));   // Write to u.arr
    return u.obj;   // Read from u.obj, even though arr is the active member.
}
like image 258
Martin C. Martin Avatar asked Sep 02 '16 14:09

Martin C. Martin


People also ask

What is undefined behavior in C?

Undefined Behavior in C and C++ So, in C/C++ programming, undefined behavior means when the program fails to compile, or it may execute incorrectly, either crashes or generates incorrect results, or when it may fortuitously do exactly what the programmer intended.

What causes undefined behavior in C?

In C the use of any automatic variable before it has been initialized yields undefined behavior, as does integer division by zero, signed integer overflow, indexing an array outside of its defined bounds (see buffer overflow), or null pointer dereferencing.


1 Answers

What you want is this:

T result;
char * p = reinterpret_cast<char *>(&result);   // or std::addressof(result) !

std::memcpy(p, ptr, sizeof(T));                 // or std::copy!!

return result;

No aliasing violation. If you want a T, you need to have a T. If your type is trivially copyable, then hopefully it is also trivially constructible and there is no cost. In any event, you have to copy the return operand out into the function return value, and that copy is elided, so there's really no extra cost here.

like image 54
Kerrek SB Avatar answered Oct 05 '22 11:10

Kerrek SB