When is casting between pointer types not undefined behavior in C?

As a newcomer to C, I'm confused about when casting a pointer is actually OK.

As I understand, you can pretty much cast any pointer type to any other type, and the compiler will let you do it. For example:

int a = 5;
int* intPtr = &a;
char* charPtr = (char*) intPtr; 

However, in general this invokes undefined behavior (though it happens to work on many platforms). This said, there seem to be some exceptions:

  • you can cast to and from void* freely (?)
  • you can cast to and from char* freely (?)

(at least I've seen it in code...).

So which casts between pointer types are not undefined behaviour in C?

Edit:

I tried looking into the C standard (section "6.3.2.3 Pointers", at http://c0x.coding-guidelines.com/6.3.2.3.html ), but didn't really understand it, apart from the bit about void*.

Edit2:

Just for clarification: I'm explicitly only asking about "normal" pointers, i.e. not about function pointers. I realize that the rules for casting function pointers are very restrictive. As a matter of fact, I've already asked about that :-): What happens if I cast a function pointer, changing the number of parameters

sleske asked Jan 26 '11 21:01



3 Answers

Basically:

  • a T * may be freely converted to a void * and back again (where T * is not a function pointer), and you will get the original pointer.
  • a T * may be freely converted to a U * and back again (where T * and U * are not function pointers), and you will get the original pointer, provided the converted pointer is correctly aligned for U. If it is not, the behaviour is undefined.
  • a function-pointer may be freely converted to any other function-pointer type and back again, and you will get the original pointer.

Note: T * (for non-function-pointers) always satisfies the alignment requirements for char *.

Important: None of these rules says anything about what happens if you convert, say, a T * to a U * and then try to dereference it. That's a whole different area of the standard.
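To make the round-trip rules above concrete, here is a minimal sketch. The helper names (round_trip_via_void, round_trip_via_char, round_trip_via_generic_fn, add) are made up for illustration. Note that none of the converted pointers is dereferenced or called through the "wrong" type; only the conversions themselves are shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: int * -> void * -> int *.
   The result is guaranteed to compare equal to the original. */
int *round_trip_via_void(int *p)
{
    void *generic = p;   /* implicit conversion, no cast needed in C */
    return generic;      /* implicit conversion back to int * */
}

/* Hypothetical helper: int * -> char * -> int *.
   char has the weakest alignment requirement, so the intermediate
   pointer is always correctly aligned and the round trip is defined. */
int *round_trip_via_char(int *p)
{
    char *bytes = (char *)p;
    return (int *)bytes;
}

typedef int (*binary_fn)(int, int);
typedef void (*generic_fn)(void);

int add(int a, int b) { return a + b; }

/* Hypothetical helper: function pointer -> other function pointer
   type -> original type. Calling through generic_fn would be
   undefined; converting back first makes the call valid again. */
binary_fn round_trip_via_generic_fn(binary_fn f)
{
    generic_fn g = (generic_fn)f;
    assert(g != NULL || f == NULL);
    return (binary_fn)g;
}
```

After each round trip the returned pointer can be used exactly like the original, e.g. round_trip_via_generic_fn(add)(2, 3) calls add.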

Oliver Charlesworth answered Sep 20 '22 03:09


Oli Charlesworth's excellent answer lists all cases where casting a pointer to a pointer of a different type gives a well-defined result.

In addition, there are four cases where casting a pointer gives implementation-defined results:

  • You can cast a pointer to a sufficiently large (!) integer type. C99 has the optional types intptr_t and uintptr_t for this purpose. The result is implementation-defined. On platforms that address memory as a contiguous stream of bytes ("linear memory model", used by most modern platforms), it usually returns the numeric value of the memory address the pointer points to, thus simply a byte count. However, not all platforms use a linear memory model, which is why this is implementation-defined :-).
  • Conversely, you can cast an integer to a pointer. If the integer has type intptr_t or uintptr_t and was created by casting a valid void*, casting it back to void* will give you a pointer that compares equal to the original (which however may no longer be valid). Otherwise the result is implementation-defined. Note that actually dereferencing the pointer (as opposed to just reading its value) may still be UB.
  • You can cast a pointer to any object to char*. Then the result points to the lowest addressed byte of the object, and you can read the remaining bytes of the object by incrementing the pointer, up to the object's size. Of course, which values you actually get is again implementation-defined...
  • You can freely cast null pointers; they'll always stay null pointers regardless of pointer type :-).
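A small sketch of the integer round trip and the null-pointer case from the list above. It assumes the platform provides the optional type uintptr_t; the function names are invented for this example. The integer value itself is implementation-defined, so the code never interprets it, it only converts it back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pointer -> uintptr_t -> pointer.
   Assumes uintptr_t exists (it is optional in C99). The guarantee
   is stated for void *, so the conversion goes through void *. */
int *round_trip_via_integer(int *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;  /* implementation-defined value */
    void *back = (void *)bits;              /* compares equal to (void *)p */
    assert(back == (void *)p);
    return back;
}

/* Hypothetical helper: a null pointer stays a null pointer when
   converted to other pointer types. Returns 1 if that holds. */
int null_stays_null(void)
{
    int *ip = NULL;
    char *cp = (char *)ip;
    void *vp = ip;
    return cp == NULL && vp == NULL;
}
```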

Source: C99 standard, sections 6.3.2.3 "Pointers", and 7.18.1.4 "Integer types capable of holding object pointers".

As far as I can tell, all other casts of a pointer to a pointer of a different type are undefined behavior. In particular, if you are not casting to char* or a sufficiently large integer type, it may be UB to cast a pointer to a different pointer type, even without dereferencing it.

This is because the types may have different alignment, and there is no general, portable way to make sure different types have compatible alignment (except for some special cases, such as signed/unsigned integer type pairs).
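One common way to sidestep the alignment problem, which is not part of the answer above but is a widely used idiom, is to avoid the pointer cast entirely and copy the bytes with memcpy. The sketch below assumes a hypothetical helper read_int_at that reads an int from an arbitrary byte offset in a buffer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reads an int stored at a possibly misaligned
   byte offset. Casting (buffer + offset) to int * could produce a
   misaligned pointer, which is undefined; copying the bytes into a
   properly aligned local variable avoids that. */
int read_int_at(const unsigned char *buffer, size_t offset)
{
    int value;
    assert(buffer != NULL);
    memcpy(&value, buffer + offset, sizeof value);
    return value;
}
```

The caller is still responsible for making sure that sizeof(int) bytes starting at the offset actually hold the representation of an int.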

sleske answered Sep 21 '22 03:09


Generally, if as usual nowadays the pointers themselves have the same alignment properties, the problem is not the cast itself, but whether or not you may access the data through the pointer.

Casting T* to void* and back is guaranteed to give you exactly the same pointer for any object type T. void* is the catch-all object pointer type.

For other casts between object pointer types there is no guarantee: accessing an object through such a pointer may cause all sorts of problems, such as misaligned accesses (bus errors) or trap representations of integers. Different pointer types are not even guaranteed to have the same width, so theoretically you might even lose information.

One cast that should always work, though, is to (unsigned char*). Through such a pointer you may then investigate the individual bytes of your object.
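As a rough illustration of inspecting an object through unsigned char*, here is a sketch with two made-up helpers: dump_bytes prints the object representation, and count_nonzero_bytes returns something that is easy to check. Which byte values you see depends on the platform (endianness, padding).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints every byte of an object in hex,
   starting at the lowest addressed byte. */
void dump_bytes(const void *object, size_t size)
{
    const unsigned char *bytes = object;
    assert(object != NULL);
    for (size_t i = 0; i < size; i++)
        printf("%02x ", (unsigned)bytes[i]);
    putchar('\n');
}

/* Hypothetical helper: counts the bytes of an object that are
   not zero, reading them one at a time through unsigned char *. */
size_t count_nonzero_bytes(const void *object, size_t size)
{
    const unsigned char *bytes = object;
    size_t count = 0;
    assert(object != NULL);
    for (size_t i = 0; i < size; i++)
        if (bytes[i] != 0)
            count++;
    return count;
}
```

For example, with int a = 5; a call like dump_bytes(&a, sizeof a) walks over exactly sizeof(int) bytes and never reads past the object.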

Jens Gustedt answered Sep 19 '22 03:09