
Using memcpy to copy an int into a char array and then printing its members: undefined behaviour?

Consider the following code:

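#include <cstring>
#include <iostream>
int main() {
    int i = 1;
    char c[sizeof (i)];
    std::memcpy(c, &i, sizeof (i));       // copy the object representation of i
    std::cout << static_cast<int>(c[0]);  // print the first byte as a number
}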

Please ignore whether this is good code. I know the output depends on the endianness of the system. This is only an academic question.

Is this code:

  • Undefined behaviour
  • Implementation-defined behaviour
  • Well-defined behaviour
  • Something else
Neil Kirk asked Mar 20 '15 18:03



3 Answers

The language does not say that doing this is immediately undefined behavior. It says only that the representation of c[0] might turn out to be an invalid (trap) representation, in which case the behavior is indeed undefined. When c[0] is not a trap representation, the behavior is implementation-defined.

If you use an unsigned char array, a trap representation becomes impossible and the behavior becomes purely implementation-defined.

AnT answered Nov 13 '22 05:11



The rule you are looking for is 3.9p4:

The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T). The value representation of an object is the set of bits that hold the value of type T. For trivially copyable types, the value representation is a set of bits in the object representation that determines a value, which is one discrete element of an implementation-defined set of values.

So if you use unsigned char, you do get implementation-defined behavior (every conforming implementation must document what that behavior is).

Reading through char is also legal, but then the values are unspecified. You are, however, guaranteed that copying through plain char preserves the value (so plain char cannot have trap representations or padding bits), according to 3.9p2:

For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value.

("unspecified" values are a bit weaker than "implementation-defined" values -- the semantics are the same but the platform is not required to document what the values are.)

Ben Voigt answered Nov 13 '22 07:11



It is clearly implementation-defined behaviour.

The internal representation of an int is not defined by the standard (implementations can choose little-endian, big-endian, or something else), so it cannot be well-defined behaviour: the result is allowed to differ between architectures.

On a given system (architecture, compiler and possibly its configuration) the behaviour is fully determined: on a little-endian system you will get 1, on a big-endian system 0 (assuming sizeof(int) is greater than 1). So it is implementation-defined behaviour.

Serge Ballesta answered Nov 13 '22 06:11
