Is it well defined in C++ to dereference a one-past-the-end pointer to an array type?
Consider the following code:
#include <cassert>
#include <iterator>

int main()
{
    // An array of ints
    int my_array[] = { 1, 2, 3 };

    // Pointer to the array
    using array_ptr_t = int(*)[3];
    array_ptr_t my_array_ptr = &my_array;

    // Pointer one-past-the-end of the array
    array_ptr_t my_past_end = my_array_ptr + 1;

    // Is this valid?
    auto is_this_valid = *my_past_end;

    // Seems to yield one-past-the-end of my_array
    assert(is_this_valid == std::end(my_array));
}
Common wisdom is that it's undefined behavior to dereference a one-past-the-end pointer. However, does this hold true for pointers to array types?
It seems reasonable that this should be valid, since *my_past_end can be resolved purely with pointer arithmetic and yields a pointer to the first element of the array that would be there, which happens to also be a valid one-past-the-end int* for the original array my_array.
However, another way of looking at it is that *my_past_end produces a reference to an array that doesn't exist, which then implicitly converts to an int*. That reference seems problematic to me.
For context, my question was brought on by this question, specifically the comments to this answer.
Edit: This question is not a duplicate of "Take the address of a one-past-the-end array element via subscript: legal by the C++ Standard or not?". I'm asking whether the rule explained in that question also applies to pointers that point to an array type.
Edit 2: Removed auto to make explicit that my_array_ptr is not an int*.
A pointer to an array points to the array as a whole, so dereferencing it yields the array itself (an lvalue of array type), not an address. In most contexts that array expression is then implicitly converted to a pointer to its first element, which is why dereferencing a pointer to an array appears to give the base address of the array it points to.
You cannot dereference an array, only a pointer. What's happening here is that an expression of array type, in most contexts, is implicitly converted to ("decays" to) a pointer to the first element of the array object. So ar "decays" to &ar[0]; dereferencing that gives you the value of ar[0], which is an int.
Dereferencing is used to access or manipulate the data stored at the memory location a pointer points to. Applying * (asterisk) to a pointer variable refers to the object being pointed to; this is what is called dereferencing the pointer.
Dereferencing means getting at the data stored at the memory location the pointer points to. Because the pointer has a type, the type of the data read from that location is known.
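To make the decay described above concrete, here is a minimal sketch that uses a pointer to an array that actually exists, so nothing in it is in dispute. The function name deref_decays_to_first_element and the local names ar, p and first are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper (not from the thread): dereferences a pointer to a
// live array and checks what the result is and what it decays to.
inline bool deref_decays_to_first_element()
{
    int ar[] = { 1, 2, 3 };
    int (*p)[3] = &ar;              // pointer to the whole array

    // *p is an lvalue of type int[3], not a pointer.
    static_assert(std::is_same<decltype(*p), int(&)[3]>::value,
                  "*p refers to the array itself");

    int* first = *p;                // array-to-pointer conversion happens here
    assert(first == &ar[0]);

    // **p dereferences the decayed pointer, giving the first element.
    return **p == 1 && sizeof(*p) == 3 * sizeof(int);
}
```

The sizeof check is there because sizeof is one of the contexts where the array does not decay: sizeof(*p) is the size of the whole int[3].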
This is CWG 232. That issue might seem like it's mainly about dereferencing a null pointer but it's fundamentally about what it means to simply dereference something that doesn't point to an object. There is no explicit language rule about this case.
One of the examples in the issue is:
Similarly, dereferencing a pointer to the end of an array should be allowed as long as the value is not used:
char a[10]; char *b = &a[10]; // equivalent to "char *b = &*(a+10);"
Both cases come up often enough in real code that they should be allowed.
This is basically the same thing as OP (the a[10] part of the above expression), except using char instead of an array type.
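For comparison, the past-the-end pointer in the issue's example can also be formed without writing a dereference at all. This is only a sketch of that alternative spelling, not part of the issue text; the helper name past_end_by_arithmetic is hypothetical.

```cpp
#include <cassert>
#include <iterator>

// Hypothetical helper: forms the past-the-end pointer of a char[10] with
// pointer arithmetic only, i.e. `a + 10` instead of `&a[10]`.
inline char* past_end_by_arithmetic(char (&a)[10])
{
    char* b = a + 10;               // a decays to char*, then 10 is added
    assert(b == std::end(a));       // same value that std::end reports
    return b;
}
```

Forming a + 10 is ordinary pointer arithmetic within the array's bounds plus one, so it does not depend on how CWG 232 is eventually resolved.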
Common wisdom is that it's undefined behavior to dereference a one-past-the-end pointer. However, does this hold true for pointers to array types?
There is no difference in the rules based on what kind of pointer it is. my_past_end is a past-the-end pointer, so whether it's UB to dereference it or not is not a function of the fact that it points to an array as opposed to any other kind of type.
It is true that the type of is_this_valid is an int*, which gets initialized from an int(&)[3] (array-to-pointer decay), and thus nothing here actually reads from memory. However, that is immaterial to the way the language rules work. my_past_end is a pointer whose value is past the end of an object, and that's the only thing that matters.
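If all that is needed is the one-past-the-end int*, one way to sidestep the question is to dereference the pointer that does point to the array and add the element count. The following is a sketch under that assumption; end_via_valid_pointer and end_matches_std_end are hypothetical names, not anything from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical helper: p must point to an existing array of N elements.
// *p names that array, it decays to T*, and adding N gives its
// one-past-the-end element pointer.
template <typename T, std::size_t N>
T* end_via_valid_pointer(T (*p)[N])
{
    return *p + N;
}

// Mirrors the question's setup, but never dereferences my_array_ptr + 1.
inline bool end_matches_std_end()
{
    int my_array[] = { 1, 2, 3 };
    int (*my_array_ptr)[3] = &my_array;

    int* e = end_via_valid_pointer(my_array_ptr);
    assert(e == std::end(my_array));
    return e - my_array == 3;
}
```

This produces the same int* value the question's assert compares against, while only ever dereferencing a pointer to an object that exists.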