Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is converting between pointer-to-T, array-of-T and pointer-to-array-of-T ever undefined behaviour?

Consider the following code.

#include <stdio.h>
int main() {
 typedef int T;
 T a[] = { 1, 2, 3, 4, 5, 6 };
 T(*pa1)[6] = (T(*)[6])a;
 T(*pa2)[3][2] = (T(*)[3][2])a;
 T(*pa3)[1][2][3] = (T(*)[1][2][3])a;
 T *p = a;
 T *p1 = *pa1;
 //T *p2 = *pa2; //error in c++
 //T *p3 = *pa3; //error in c++
 T *p2 = **pa2;
 T *p3 = ***pa3;
 printf("%p %p %p %p %p %p %p\n", a, pa1, pa2, pa3, p, p1, p2, p3);
 printf("%d %d %d %d %d %d %d\n", a[5], (*pa1)[5], 
   (*pa2)[2][1], (*pa3)[0][1][2], p[5], p1[5], p2[5], p3[5]);
 return 0;
}

The above code compiles and runs in C, producing the expected results. All the pointer values are the same, as are all the int values. I think the result will be the same for any type T, but int is the easiest to work with.

I confessed to being initially surprised that dereferencing a pointer-to-array yields an identical pointer value, but on reflection I think that is merely the converse of the array-to-pointer decay we know and love.

[EDIT: The commented out lines trigger errors in C++ and warnings in C. I find the C standard vague on this point, but this is not the real question.]

In this question, it was claimed to be Undefined Behaviour, but I can't see it. Am I right?

Code here if you want to see it.


Right after I wrote the above it dawned on me that those errors are because there is only one level of pointer decay in C++. More dereferencing is needed!

 T *p2 = **pa2; //no error in c or c++
 T *p3 = ***pa3; //no error in c or c++

And before I managed to finish this edit, @AntonSavin provided the same answer. I have edited the code to reflect these changes.

like image 865
david.pfx Avatar asked Aug 28 '14 01:08

david.pfx


People also ask

What is difference between array of pointers and pointer to array?

A user creates a pointer for storing the address of any given array. A user creates an array of pointers that basically acts as an array of multiple pointer variables. It is alternatively known as an array pointer. These are alternatively known as pointer arrays.

Can a pointer act as an array?

Technically speaking, an array name never acts as a pointer. An expression with array type (which could be an array name) will convert to a pointer anytime an array type is not legal, but a pointer type is. And the declaration of an array as a function parameter is converted into a declaration of a pointer.

Which is faster array or pointer and why?

Originally Answered: why is pointer indexing faster than array indexing? It's straight forward that array will always will be faster. Because memory allocation of array is continuous. So accessing array is much faster compare to pointer where memory allocation might or might not be continuous.


2 Answers

This is a C-only answer.

C11 (n1570) 6.3.2.3 p7

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned*) for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.

*) In general, the concept “correctly aligned” is transitive: if a pointer to type A is correctly aligned for a pointer to type B, which in turn is correctly aligned for a pointer to type C, then a pointer to type A is correctly aligned for a pointer to type C.

The standard is a little vague what happens if we use such a pointer (strict aliasing aside) for anything else than converting it back, but the intent and wide-spread interpretation is that such pointers should compare equal (and have the same numerical value, e.g. they should also be equal when converted to uintptr_t), as an example, think about (void *)array == (void *)&array (converting to char * instead of void * is explicitly guaranteed to work).

T(*pa1)[6] = (T(*)[6])a;

This is fine, the pointer is correctly aligned (it’s the same pointer as &a).

T(*pa2)[3][2] = (T(*)[3][2])a; // (i)
T(*pa3)[1][2][3] = (T(*)[1][2][3])a; // (ii)

Iff T[6] has the same alignment requirements as T[3][2], and the same as T[1][2][3], (i), and (ii) are safe, respectively. To me, it sounds strange, that they couldn’t, but I cannot find a guarantee in the standard that they should have the same alignment requirements.

T *p = a; // safe, of course
T *p1 = *pa1; // *pa1 has type T[6], after lvalue conversion it's T*, OK
T *p2 = **pa2; // **pa2 has type T[2], or T* after conversion, OK
T *p3 = ***pa3; // ***pa3, has type T[3], T* after conversion, OK

Ignoring the UB caused by passing int * where printf expects void *, let’s look at the expressions in the arguments for the next printf, first the defined ones:

a[5] // OK, of course
(*pa1)[5]
(*pa2)[2][1]
(*pa3)[0][1][2]
p[5] // same as a[5]
p1[5]

Note, that strict aliasing isn’t a problem here, no wrongly-typed lvalue is involved, and we access T as T.

The following expressions depend on the interpretation of out-of-bounds pointer arithmetic, the more relaxed interpretation (allowing container_of, array flattening, the “struct hack” with char[], etc.) allows them as well; the stricter interpretation (allowing a reliable run-time bounds-checking implementation for pointer arithmetic and dereferencing, but disallowing container_of, array flattening (but not necessarily array “lifting”, what you did), the struct hack, etc.) renders them undefined:

p2[5] // UB, p2 points to the first element of a T[2] array
p3[5] // UB, p3 points to the first element of a T[3] array
like image 114
mafso Avatar answered Oct 23 '22 09:10

mafso


The only reason your code compiles in C is that your default compiler setup allows the compiler to implicitly perform some illegal pointer conversions. Formally, this is not allowed by C language. These lines

T *p2 = *pa2;
T *p3 = *pa3;

are ill-formed in C++ and produce constraint violations in C. In casual parlance, these lines are errors in both C and C++ languages.

Any self-respecting C compiler will issue (is actually required to issue) diagnostic messages for these constraint violations. GCC compiler, for one example, will issue "warnings" telling you that pointer types in the above initializations are incompatible. While "warnings" are perfectly sufficient to satisfy standard requirements, if you really want to use GCC compiler's ability to recognize constraint violating C code, you have to run it with -pedantic-errors switch and, preferably, explicitly select standard language version by using -std= switch.

In your experiment, C compiler performed these implicit conversions for you as a non-standard compiler extension. However, the fact that GCC compiler running under ideone front completely suppressed the corresponding warning messages (issued by the standalone GCC compiler even in its default configuration) means that ideone is a broken C compiler. Its diagnostic output cannot be meaningfully relied upon to tell valid C code from invalid one.

As for the conversion itself... It is not undefined behavior to perform this conversion. But it is undefined behavior to access array data through the converted pointers.

like image 23
AnT Avatar answered Oct 23 '22 11:10

AnT