Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++: Semantics of [x] for dereferencing: C-style vs pointer array

Tags:

c++

arrays

I know that a c style array is stored as a contiguous block of memory. That is why the following code:

int main (int argc, char *argv[]) {
    int arr[3][3];
    *(*arr + 5) = 5;
    std::cout << arr[1][2] << std::endl;
    return 0;
}

prints 5. I assume that for c style arrays *(*arr + 5) = 5; is roughly equals to the code which the compiler produces for arr[1][2] = 5; isn't it? (Q1)

If so, then the semantics of arr[1][2] (i.e. shifting on one block of memory) is totally different from doing the same on a multidimensional pointer array where every level of nesting results in a pointer being dereferenced. Is that right? (Q2)

Is there any situation where I need to pay attention to that myself? I.e. where the compiler does not know himself what kind of array he is dealing with? (Q3)

(Qx marks my questions)

Thank you in advance and Regards

like image 897
ben Avatar asked Dec 22 '13 12:12

ben


1 Answers

On one hand, you're talking about a two dimensional array, which you can imagine looks something like this in memory:

 0,0 0,1 0,2 1,0 1,1 1,2 2,0 2,1 2,2
┌───┬───┬───┬───┬───┬───┬───┬───┬───┐
│int│int│int│int│int│int│int│int│int│
└───┴───┴───┴───┴───┴───┴───┴───┴───┘

On the other hand, when you have an array of pointers to arrays, it looks like this:

┌────┬────┬────┐
│int*│int*│int*┿━━━━━━━━━━━━━━┓
└─╂──┴─╂──┴────┘              ┃
  ┃    ┗━━━━━━━━┓             ┃
  ▼             ▼             ▼
┌───┬───┬───┐ ┌───┬───┬───┐ ┌───┬───┬───┐
│int│int│int│ │int│int│int│ │int│int│int│
└───┴───┴───┘ └───┴───┴───┘ └───┴───┴───┘ 
 0,0 0,1 0,2   1,0 1,1 1,2   2,0 2,1 2,2

Both of these can be indexed using the same [i][j] syntax. The [] operator is defined as x[i] being equivalent to *((x) + (i)). If we index x[1][1], we have *((*((x) + (1)) + (1)).

In the first case above, first the array name x undergoes array-to-pointer conversion to get a pointer to its first element (which is itself an array, so we have a int (*)[3]), then we add 1 to it to move along to the next subarray and dereference it. This subarray then also undergoes array-to-pointer conversion to get a pointer to its first element, which we add 1 to again and dereference. So what we end up with the end is the 2nd element in the 2nd subarray.

In the second case, we are dealing with an array of pointers. First the array name x undergoes array-to-pointer conversion to get a pointer to its first element (which is a int*). Then we add 1 to it to move to the next pointer in the array and it is dereferenced to get the int* itself. Then we add 1 to this int* to move along to the next int after the one it is currently pointing at and we dereference that. That again gives us the 2nd element in the 2nd array of ints.

So, given all that:

  1. Yes, since the elements of the 2D array are contiguous, you can do pointer arithmetic where you treat it as a 1D array with 9 ints in it.

  2. Yes, the memory layout of the data in each case is different, so the operations that occur when indexing into them are different.

  3. The compiler always knows which type it is dealing with. The compiler won't let you, for example, attempt to convert a 2D array to a int**. They are simply incompatible. However, you generally need to make sure you know what the memory layout of your data is.


Sometimes you have the following kind of layout where you have an int**, particularly common when you dynamically allocate an array of pointers that point to dynamically allocated arrays:

┌─────┐
│int**│
└─╂───┘
  ▼
┌────┬────┬────┐
│int*│int*│int*┿━━━━━━━━━━━━━━┓
└─╂──┴─╂──┴────┘              ┃
  ┃    ┗━━━━━━━━┓             ┃
  ▼             ▼             ▼
┌───┬───┬───┐ ┌───┬───┬───┐ ┌───┬───┬───┐
│int│int│int│ │int│int│int│ │int│int│int│
└───┴───┴───┘ └───┴───┴───┘ └───┴───┴───┘ 
 0,0 0,1 0,2   1,0 1,1 1,2   2,0 2,1 2,2

The process of indexing this layout is almost exactly the same as the second case above. The only difference is that the first array-to-pointer conversion is not necessary because we already have a pointer.

like image 159
Joseph Mansfield Avatar answered Sep 30 '22 02:09

Joseph Mansfield