How does *(&arr + 1) - arr work to give the array size? [duplicate]

Tags: c++, arrays

int arr[] = { 3, 5, 9, 2, 8, 10, 11 };      
int arrSize = *(&arr + 1) - arr;
std::cout << arrSize;

I don't understand how this works. Can anyone explain it?

Rahul Goswami asked May 12 '21 19:05


5 Answers

If we "draw" the array together with the pointers, it will look something like this:

+--------+--------+-----+--------+-----+
| arr[0] | arr[1] | ... | arr[6] | ... |
+--------+--------+-----+--------+-----+
^        ^                       ^
|        |                       |
&arr[0]  &arr[1]                 |
|                                |
&arr                             &arr + 1

The type of the expressions &arr and &arr + 1 is int (*)[7]. If we dereference either of those pointers, we get a value of type int[7], and as with all arrays, it will decay to a pointer to its first element.

So what's happening is that we take the difference between a pointer to the first element of *(&arr + 1) (the dereference really makes this UB, but it will still work with any sane compiler) and a pointer to the first element of arr.

All pointer arithmetic is done in the base-unit of the pointed-to type, which in this case is int, so the result is the number of int elements between the two addresses being pointed at.
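
The types named above can be checked at compile time. What follows is only a minimal sketch, assuming C++11 or later and an array with static storage duration; the helper elements_between is a made-up name for illustration. Nothing here dereferences &arr + 1 at run time, because decltype does not evaluate its operand.

```cpp
#include <cstddef>
#include <type_traits>

int arr[] = { 3, 5, 9, 2, 8, 10, 11 };

// &arr and &arr + 1 are both pointers to the whole array.
static_assert(std::is_same<decltype(&arr), int (*)[7]>::value, "&arr is int (*)[7]");
static_assert(std::is_same<decltype(&arr + 1), int (*)[7]>::value, "&arr + 1 is int (*)[7]");

// Dereferencing such a pointer gives an lvalue of type int[7].
// decltype only inspects the type; the expression is never evaluated.
static_assert(std::is_same<decltype(*(&arr + 1)), int (&)[7]>::value, "lvalue of type int[7]");

// Used in arithmetic, the array decays to int *, and int * minus int * is a ptrdiff_t.
static_assert(std::is_same<decltype(arr + 0), int *>::value, "arr decays to int *");
static_assert(std::is_same<decltype(*(&arr + 1) - arr), std::ptrdiff_t>::value, "element count type");

// Hypothetical helper: subtracting int pointers counts int elements, not bytes.
constexpr std::ptrdiff_t elements_between(const int* first, const int* last) {
    return last - first;
}

static_assert(elements_between(arr, arr + 7) == 7, "seven ints between the two addresses");
```

If every static_assert compiles, the types are as described; elements_between(arr, arr + 7) can also be printed from main to see the value 7 at run time.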


It might be useful to know that an array will naturally decay to a pointer to its first element, i.e. the expression arr will decay to &arr[0], which has the type int *.

Also, for any pointer (or array) p and index i, the expression *(p + i) is exactly equal to p[i]. So *(&arr + 1) is really the same as (&arr)[1] (which makes the UB much more visible).
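
The *(p + i) and p[i] equivalence can be sketched as follows. This assumes C++11 or later; values is a hypothetical constexpr copy of the question's data, introduced only so the checks can run at compile time.

```cpp
#include <type_traits>

// Hypothetical constexpr copy of the data from the question.
constexpr int values[] = { 3, 5, 9, 2, 8, 10, 11 };

// *(p + i) and p[i] name the same element.
static_assert(*(values + 2) == values[2], "same value");
static_assert(&*(values + 2) == &values[2], "same address");
static_assert(values[2] == 9, "third element of the list");

// Applied to a pointer to the whole array, indexing yields an array:
// (&values)[1] has the type of a second int[7] that does not exist.
// decltype only inspects the type, so nothing is evaluated here.
static_assert(std::is_same<decltype((&values)[1]), const int (&)[7]>::value,
              "(&values)[1] would be a whole array");
```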

Some programmer dude answered Nov 19 '22 02:11


That program has undefined behaviour. (&arr + 1) is a valid pointer that points "one beyond" arr and has type int(*)[7]; however, it doesn't point to an int [7], so dereferencing it is invalid.

It so happens that your implementation assumes there is a second int [7] after the one you declare. It then subtracts the location of the first element of the array that does exist from the location of the first element of the fictitious array that the pointer arithmetic invented.

Caleth answered Nov 19 '22 00:11


You need to explore what the type of the &arr expression is, and how that affects the + 1 operation on it.

Pointer arithmetic works in 'raw units' of the pointed-to type; &arr is the address of your array, so it points to an object of type "array of 7 int". Adding 1 to that pointer actually adds the size of that type to the address, so 7 * sizeof(int) is added to the address.

However, in the outer expression (subtraction of arr), the operands are pointers to int objects[1] (not arrays), so the 'units' are just sizeof(int), which is 7 times smaller than in the inner expression. Thus, the subtraction results in the size of the array.


[1] This is because, in such expressions, an array variable (such as the second operand, arr) decays to a pointer to its first element. The first operand is also an array, because the * operator dereferences the incremented pointer-to-array, so it decays in the same way.
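
The two different 'units' can be compared without dereferencing anything. This is a rough sketch, assuming C++11 or later; step_bytes is a hypothetical helper that reports how many bytes "+ 1" moves a pointer of a given type.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: the number of bytes that "+ 1" advances a pointer of type P.
template <typename P>
constexpr std::size_t step_bytes() {
    static_assert(std::is_pointer<P>::value, "P must be a pointer type");
    return sizeof(typename std::remove_pointer<P>::type);
}

int arr[] = { 3, 5, 9, 2, 8, 10, 11 };

// Inner expression: &arr has type int (*)[7], so + 1 steps over the whole array.
static_assert(step_bytes<decltype(&arr)>() == 7 * sizeof(int), "7 * sizeof(int) bytes");

// Outer expression: both operands are int *, so the unit is one int.
static_assert(step_bytes<decltype(arr + 0)>() == sizeof(int), "sizeof(int) bytes");

// The ratio of the two units is the element count.
static_assert(step_bytes<decltype(&arr)>() / step_bytes<decltype(arr + 0)>() == 7, "ratio is 7");
```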


Note on Possible UB: Other answers (and comments thereto) have suggested that the dereferencing operation, *(&arr + 1), invokes undefined behaviour. However, looking through this Draft C++17 Standard, there is the vaguest of suggestions that it may not:

6.7.2 Compound Types
...
3    … For purposes of pointer arithmetic (8.5.6) and comparison (8.5.9, 8.5.10), a pointer past the end of the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical element x[n].

But I won't claim "Language-Lawyer" status here, as there is no explicit mention in that section about dereferencing such a pointer.

Adrian Mole answered Nov 19 '22 01:11


If you have a declaration like this

int arr[] = { 3, 5, 9, 2, 8, 10, 11 };

then the expression &arr + 1 will point to the memory after the last element of the array. The value of the expression is equal to the value of the expression arr + 7, where 7 is the number of elements in the array declared above. The only difference is that the expression &arr + 1 has the type int ( * )[7] while the expression arr + 7 has the type int *.

So, due to pointer arithmetic, the difference ( arr + 7 ) - arr will yield 7: the number of elements in the array.

On the other hand, dereferencing the expression &arr + 1, which has the type int ( * )[7], gives an lvalue of the type int[7]. Used in the expression *(&arr + 1) - arr, that lvalue is in turn converted to a pointer of the type int * and has the same value as arr + 7, as pointed out above. So the expression will yield the number of elements in the array.

The only difference between these two expressions

( arr + 7 ) - arr

and

*( &arr + 1 ) - arr

is that in the first case we need to specify the number of elements explicitly to get the address of the memory after the last element of the array, while in the second case the compiler itself calculates that address from the array declaration.
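
As a side-by-side illustration, the first form can also be written with a count the compiler supplies, so nothing is dereferenced. This is a sketch only, assuming C++11 or later and an array with static storage duration; N is a name introduced here, and std::extent is just one way to obtain the count.

```cpp
#include <cstddef>
#include <type_traits>

int arr[] = { 3, 5, 9, 2, 8, 10, 11 };

// The element count the compiler deduced from the initializer list.
constexpr std::size_t N = std::extent<decltype(arr)>::value;
static_assert(N == 7, "seven initializers, seven elements");

// ( arr + 7 ) - arr, with the 7 supplied by the compiler instead of by hand.
// arr + N is the one-past-the-end pointer, which is valid to form and subtract.
static_assert((arr + N) - arr == 7, "one-past-the-end minus first element");
```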

Vlad from Moscow answered Nov 19 '22 02:11


As others have mentioned, *(&arr + 1) triggers undefined behavior because &arr + 1 is a pointer to one past the end of an array of type int [7] and that pointer is subsequently dereferenced.

An alternate way of doing this would be to convert the relevant pointers to uintptr_t, subtract, and divide by the element size.

// uintptr_t is declared in <cstdint>. The outer cast must be static_cast:
// reinterpret_cast cannot convert one integer type to another.
int arrSize = static_cast<int>((reinterpret_cast<uintptr_t>(&arr + 1) -
                                reinterpret_cast<uintptr_t>(arr)) / sizeof *arr);

Or using C-style casts:

int arrSize = (int)(((uintptr_t)(&arr + 1) - (uintptr_t)arr) / sizeof *arr);
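
For comparison, two common ways to get the element count with no pointer arithmetic at all are sketched below. They are not part of the answer above; array_size is a hypothetical helper written for this example, and C++17 also provides std::size in <iterator>, which does the same job.

```cpp
#include <cstddef>

// Hypothetical helper: the compiler deduces N from the array's type.
template <typename T, std::size_t N>
constexpr std::size_t array_size(const T (&)[N]) noexcept {
    return N;
}

int arr[] = { 3, 5, 9, 2, 8, 10, 11 };

// Count deduced from the type int[7].
static_assert(array_size(arr) == 7, "deduced from int[7]");

// Classic sizeof idiom: total bytes divided by bytes per element.
// sizeof does not evaluate *arr, so nothing is dereferenced.
static_assert(sizeof arr / sizeof *arr == 7, "sizeof idiom");
```

Both forms only accept real arrays at the call site; once arr has decayed to an int * (for example, after being passed to a function taking int *), the size information is gone and neither can recover it.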

dbush answered Nov 19 '22 00:11