
What are the rules for casting pointers in C?

K&R doesn't go over it, but they use it. I tried seeing how it'd work by writing an example program, but it didn't go so well:

#include <stdio.h>

int bleh(int *);

int main(){
    char c = '5';
    char *d = &c;
    bleh((int *)d);
    return 0;
}

int bleh(int *n){
    printf("%d bleh\n", *n);
    return *n;
}

It compiles, but my print statement spits out garbage values (they're different every time I run the program). Any ideas?

Theo Chronic asked Jun 23 '13 11:06



1 Answer

When thinking about pointers, it helps to draw diagrams. A pointer is an arrow that points to an address in memory, with a label indicating the type of the value. The address indicates where to look and the type indicates what to take. Casting the pointer changes the label on the arrow but not where the arrow points.
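The "label changes, arrow stays put" idea can be checked directly. The sketch below goes in the safe direction (an int viewed through a byte pointer, which C always permits) and compares the two addresses. The function name same_address_after_cast is made up for illustration.

```c
#include <stdio.h>

/* Returns 1 if the pointer still holds the same address after the cast. */
int same_address_after_cast(void)
{
    int x = 42;
    int *p = &x;

    /* Converting to a character pointer is always allowed and yields
       a pointer to the lowest-addressed byte of x. */
    unsigned char *q = (unsigned char *)p;

    printf("p = %p, q = %p\n", (void *)p, (void *)q);
    return (void *)p == (void *)q;
}
```

Only the type seen by the compiler differs between p and q: dereferencing p reads a whole int, while dereferencing q reads a single byte.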

d in main is a pointer to c, which is of type char. A char is one byte of memory, so when d is dereferenced, you get the value in that one byte of memory. In the diagram below, each cell represents one byte.

-+----+----+----+----+----+----+-
 |    | c  |    |    |    |    |
-+----+----+----+----+----+----+-
       ^~~~
       | char
       d

When you cast d to int*, you're saying that d really points to an int value. On most systems today, an int occupies 4 bytes.

-+----+----+----+----+----+----+-
 |    | c  | ?₁ | ?₂ | ?₃ |    |
-+----+----+----+----+----+----+-
       ^~~~~~~~~~~~~~~~~~~
       | int
       (int*)d

When you dereference (int*)d, you get a value that is determined from these four bytes of memory. The value you get depends on what is in these cells marked ?, and on how an int is represented in memory.

A PC is little-endian, which means that the value of an int is calculated this way (assuming that it spans 4 bytes): *((int*)d) == c + ?₁ * 2⁸ + ?₂ * 2¹⁶ + ?₃ * 2²⁴. So while the value is garbage, if you print it in hexadecimal (printf("%x\n", *n)), the last two digits will always be 35 (0x35 is the ASCII code of the character '5').
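The little-endian formula can be written out as a small function. This is only a sketch: it uses uint32_t instead of int so the width is fixed at 4 bytes, and the names combine_little_endian and show_little_endian are invented for the example.

```c
#include <stdint.h>
#include <stdio.h>

/* Combine four bytes the way a little-endian machine reads a 4-byte
   integer: the byte at the lowest address is the least significant. */
uint32_t combine_little_endian(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}

/* Print the combined value as eight hexadecimal digits. */
void show_little_endian(const unsigned char b[4])
{
    printf("%08lx\n", (unsigned long)combine_little_endian(b));
}
```

With the bytes 0x35, 0xAB, 0xCD, 0xEF (the first standing in for c, the rest for the ? cells), the result is 0xEFCDAB35: whatever the three unknown bytes hold, the last two hex digits stay 35.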

Some other systems are big-endian and arrange the bytes in the other direction: *((int*)d) == c * 2²⁴ + ?₁ * 2¹⁶ + ?₂ * 2⁸ + ?₃. On these systems, you'd find that the value always starts with 35 when printed in hexadecimal. Some systems have a size of int that's different from 4 bytes. A rare few systems arrange int in different ways but you're extremely unlikely to encounter them.
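For comparison, here is the big-endian formula as code, together with a common way to find out which order the machine you're on uses. Again a sketch with made-up names (combine_big_endian, host_is_little_endian), and it assumes a 4-byte value via uint32_t.

```c
#include <stdint.h>
#include <string.h>

/* Combine four bytes the way a big-endian machine reads a 4-byte
   integer: the byte at the lowest address is the most significant. */
uint32_t combine_big_endian(const unsigned char b[4])
{
    return (uint32_t)b[0] << 24
         | (uint32_t)b[1] << 16
         | (uint32_t)b[2] << 8
         | (uint32_t)b[3];
}

/* Returns 1 if this machine stores the least significant byte first. */
int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;

    /* Copy the lowest-addressed byte of probe; it holds the 1
       only when the least significant byte comes first. */
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

The same bytes 0x35, 0xAB, 0xCD, 0xEF now give 0x35ABCDEF, so the 35 moves to the front of the hexadecimal output.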

Depending on your compiler and operating system, you may find that the value is different every time you run the program, or that it's always the same but changes when you make even minor tweaks to the source code.

On some systems, an int value must be stored at an address that's a multiple of 4 (or 2, or 8). This is called an alignment requirement. Depending on whether the address of c happens to be properly aligned or not, the program may crash.
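If you're curious what the alignment requirement is on your system, C11 can tell you through _Alignof. The sketch below checks whether an address is a multiple of it. It assumes that converting a pointer to uintptr_t gives the plain numeric address, which is implementation-defined but true on common platforms. The function names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if addr is a multiple of the alignment that int requires.
   Assumes the pointer-to-integer conversion yields the raw address. */
int is_aligned_for_int(const void *addr)
{
    return (uintptr_t)addr % _Alignof(int) == 0;
}

/* Print the address, the required alignment, and the verdict. */
void report_alignment(const void *addr)
{
    printf("%p: int needs a multiple of %zu, %s\n",
           (void *)addr, _Alignof(int),
           is_aligned_for_int(addr) ? "aligned" : "not aligned");
}
```

The address of a real int variable always passes this check. The address of a char such as c carries no such guarantee, which is why (int*)d may or may not be a usable pointer.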

In contrast with your program, here's what happens when you have an int value and take a pointer to it.

int x = 42;
int *p = &x;

-+----+----+----+----+----+----+-
 |    |         x         |    |
-+----+----+----+----+----+----+-
       ^~~~~~~~~~~~~~~~~~~
       | int
       p

The pointer p points to an int value. The label on the arrow correctly describes what's in the memory cell, so there are no surprises when dereferencing it.
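Applied to the program in the question, one way to get a predictable result is to convert the value rather than the pointer: copy the char into a real int object and pass that object's address. This is a sketch that assumes the goal was to print the character's numeric code. bleh is the question's function, while bleh_from_char is a helper invented here.

```c
#include <stdio.h>

/* Same as the question's bleh: prints and returns the int n points to. */
int bleh(int *n)
{
    printf("%d bleh\n", *n);
    return *n;
}

/* Hypothetical helper: give bleh a pointer to a genuine int object. */
int bleh_from_char(char c)
{
    int as_int = c;   /* value conversion; '5' becomes 53 in ASCII */
    return bleh(&as_int);
}
```

Here the label on the arrow matches what is stored in memory, so bleh reads exactly the four (or however many) bytes that belong to as_int.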

Gilles 'SO- stop being evil' answered Sep 21 '22 10:09