Is it undefined behaviour to memcpy from an uninitialized variable?

Is using an uninitialized variable as the src for memcpy undefined behaviour in C?

#include <string.h>

void foo(int *to) {
    int from;
    memcpy(to, &from, sizeof(from));
}
Tor Klingberg asked Oct 28 '15 14:10


2 Answers

The C committee's proposed response to defect report 451, "Instability of uninitialized automatic variables", is:

The answer to question 3 is that library functions will exhibit undefined behavior when used on indeterminate values.

The question in the defect report had sought an exemption for memcpy and fwrite if this was indeed the case, saying:

[...] The fact that one wants to be able to copy uninitialized padding bytes in structs using memcpy without undefined behavior is the reason that using the value of an uninitialized object is not undefined behavior. This seems to suggest that an fwrite of a struct with uninitialized padding bytes should not exhibit undefined behavior.

This part of the proposed response seems to be aimed at that concern over uninitialized padding:

The committee also notes that padding bytes within structures are possibly a distinct form of "wobbly" representation.
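One common way to sidestep the padding concern is to give every byte of the struct a determinate value before it is copied or written out. The following is only a minimal sketch: `struct record` and `record_init` are hypothetical names, not something the defect report prescribes. Note also that the standard lets padding bytes take unspecified values whenever a member is stored, so this is a practical habit rather than a guarantee that the padding stays zero.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical struct; most ABIs insert padding between `tag` and `value`. */
struct record {
    char tag;
    int  value;
};

/* Give every byte of *r (padding included) a value, then set the members. */
void record_init(struct record *r, char tag, int value)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);
    r->tag = tag;
    r->value = value;
}
```

After `record_init`, passing the whole struct to memcpy or fwrite no longer reads bytes that were never written.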

We can see from defect report 338, "C99 seems to exclude indeterminate value from being an uninitialized register", that this is somewhat of a change from past expectations. It says, amongst other things:

[...] I believe the intent of excluding type unsigned char from having trap representations was to allow it to be used to copy (via memcpy) arbitrary memory, in the case that memory might contain trap representations for some types.[...]
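The kind of access DR 338 has in mind can be sketched as a byte-wise copy that only ever reads and writes through unsigned char lvalues. `copy_bytes` is a hypothetical helper written for illustration, not the library's memcpy. Whether such a loop is well defined when the source bytes are indeterminate is exactly what the defect reports discuss, so it is best exercised on an initialized object.

```c
#include <assert.h>
#include <stddef.h>

/* Copy n bytes using unsigned char lvalues only. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}
```

For example, `copy_bytes(&y, &x, sizeof x)` with two double objects transfers the object representation of `x` into `y` without ever reading `x` as a double.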

The blog post "Reading indeterminate contents might as well be undefined" covers the evolution of reading indeterminate values in C well and makes more sense of the changes I mention above.

It is worth noting that this differs from C++, where reading an indeterminate value from a narrow unsigned char is not undefined behavior. Defect report 240 notes this difference:

The C committee is dealing with a similar issue in their DR338. According to this analysis, they plan to take almost the opposite approach to the one described above by augmenting the description of their version of the lvalue-to-rvalue conversion. The CWG did not consider that access to an unsigned char might still trap if it is allocated in a register and needs to reevaluate the proposed resolution in that light. See also issue 129.

Shafik Yaghmour answered Oct 01 '22 22:10


With respect to the act of copying, this is defined behaviour, unless int has a trap representation on your system. Memory was allocated on the stack when int from was defined, and the contents of this int are whatever happened to be at that stack location at that moment. The end result, the value of the int that is copied to to, is therefore not defined (indeterminate).

Other answers quote the C standard to the effect that undefined behaviour occurs when the value of an uninitialised variable is "used", which obviously doesn't apply if you don't use the value. The C11 standard also mentions undefined behaviour when copying or assigning uninitialised variables:

6.3.2.1p2

If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.

This also doesn't affect your code, because the address of from is taken when you call memcpy.
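If the aim is simply to stay clear of the whole debate, giving from an initializer means memcpy only ever reads determinate bytes. A minimal sketch, where `foo_initialized` is a hypothetical variant of the question's foo:

```c
#include <assert.h>
#include <string.h>

/* Variant of foo: `from` has an initializer, so memcpy reads determinate bytes. */
void foo_initialized(int *to)
{
    assert(to != NULL);
    int from = 0;
    memcpy(to, &from, sizeof from);
}
```

Calling `foo_initialized(&x)` leaves `x` equal to 0 on every conforming implementation, with or without trap representations.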

Another relevant part of the C11 standard is 6.2.6.1:

Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined. Such a representation is called a trap representation.

Some very old processors could have a trap representation for an int: either software-visible parity bits or "negative zero" on non-two's-complement architectures. x86 processors, for example, don't have trap representations for int.
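Whether int can have a trap representation on a given implementation can be estimated from <limits.h>. The sketch below rests on an assumption: if int has no padding bits and INT_MIN is one below -INT_MAX, then every bit pattern corresponds to a distinct value and none is left over to be a trap representation. The function names are hypothetical.

```c
#include <limits.h>

/* Count the value bits of int by shifting INT_MAX down to zero. */
static int int_value_bits(void)
{
    int bits = 0;
    for (int m = INT_MAX; m != 0; m >>= 1)
        bits++;
    return bits;
}

/* Returns 1 when int has no padding bits and covers the full
 * two's-complement range, so no bit pattern is left over to trap. */
int int_has_no_trap_representations(void)
{
    int no_padding = (int_value_bits() + 1 == (int)(sizeof(int) * CHAR_BIT));
    int full_range = (INT_MIN < -INT_MAX);
    return no_padding && full_range;
}
```

On mainstream x86 compilers this returns 1, which matches the statement that x86 has no trap representations for int.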

Manos Nikolaidis answered Oct 01 '22 22:10
