Pointer arithmetic around cast

I am currently enrolled in a CS107 class which makes the following assumptions:

  • sizeof(int) == 4
  • sizeof(short) == 2
  • sizeof(char) == 1
  • big endianness

My professor showed the following code:

int arr[5];
((short*)(((char*) (&arr[1])) + 8))[3] = 100;

Here are the 20 bytes representing arr:

|....|....|....|....|....|

My professor states that &arr[1] points here, which I agree with.

|....|....|....|....|....|
     x

I now understand that the (char*) cast makes pointer arithmetic step in units of a char (1 byte) instead of an int (4 bytes).

What I don't understand is the + 8, which my professor says points here:

|....|....|....|....|....|
                         x

But shouldn't it point here, since it is going forwards 8 times the size of a char (1 byte)?

|....|....|....|....|....|
               x
Alexey asked Feb 17 '15 17:02


2 Answers

Let's take it step by step. Your expression can be decomposed like this:

((short*)(((char*) (&arr[1])) + 8))[3]
-----------------------------------------------------
char *base = (char *) &arr[1];
char *base_plus_offset = base + 8;
short *cast_into_short = (short *) base_plus_offset;
cast_into_short[3] = 100;

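base points at byte location 4 (the start of arr[1]), and because it is a char *, adding 8 advances it by exactly 8 bytes, so base_plus_offset points at byte location 12 within the array. cast_into_short[3] refers to a short value at location 12 + sizeof(short) * 3, which, in your case, is 18.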

Blagovest Buyukliev answered Oct 12 '22 17:10


The expression will set the two bytes 18 bytes after the start of arr to the value 100.

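#include <stdio.h>

int main(void) {

    /* Zero-initialized so the values printed below are well defined. */
    int arr[5] = {0};

    /* The assignment under discussion. Strictly speaking, writing a short
       into an int array violates C's aliasing rules; illustration only. */
    ((short*)(((char*) (&arr[1])) + 8))[3] = 100;

    char* start=(char*)&arr;
    char* end=(char*)&((short*)(((char*) (&arr[1])) + 8))[3];

    printf("sizeof(int)=%zu\n",sizeof(int));
    printf("sizeof(short)=%zu\n",sizeof(short));
    printf("offset=%td <- THIS IS THE ANSWER\n",(end-start));
    printf("100=%04x (hex)\n",100);

    for(size_t i=0;i<5;++i){

       /* %x expects an unsigned argument, hence the cast. */
       printf("arr[%zu]=%d (%08x hex)\n",i,arr[i],(unsigned)arr[i]);

    }

    return 0;
}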

Possible Output:

sizeof(int)=4
sizeof(short)=2
offset=18 <- THIS IS THE ANSWER
100=0064 (hex)
arr[0]=0 (00000000 hex)
arr[1]=0 (00000000 hex)
arr[2]=0 (00000000 hex)
arr[3]=0 (00000000 hex)
arr[4]=6553600 (00640000 hex)

In all your professor's shenanigans, he has moved you 1 int, 8 chars (bytes), and 3 shorts: that's 4+8+6=18 bytes. Bingo.

Notice that this output reveals the machine I ran this on to have 4-byte ints and 2-byte shorts (common), and to be little-endian, because the last two bytes of the array were set to 0x64 and 0x00 respectively.

I find your diagrams dreadfully confusing because it isn't very clear if you mean the '|' to be addresses or not.

|....|....|....|....|
012345678901234567890
    ^     1 ^     ^ 2
A   X       C     S B

Counting the bars ('|') as byte positions too, A is the start of arr and B is 'one past the end' (a legal concept in C).

X is the address referred to by the expression &arr[1], C by the expression (((char*) (&arr[1])) + 8), and S by the whole expression. S and the byte following it are the two bytes assigned to, and what that means for the int that contains them depends on the endianness of your platform.

I leave it as an exercise to determine what the output would be on a similar but big-endian platform. Anyone? I notice from the comments you're big-endian and I'm little-endian (stop sniggering). You only need to change one line of the output.

Persixty answered Oct 12 '22 17:10