
Why is out-of-bounds pointer arithmetic undefined behaviour?

The following example is from Wikipedia.

int arr[4] = {0, 1, 2, 3};
int* p = arr + 5;  // undefined behavior

If I never dereference p, why is arr + 5 alone undefined behaviour? I expected pointers to behave like integers, with the exception that when a pointer is dereferenced its value is treated as a memory address.

NFRCR asked May 06 '12 19:05



2 Answers

That's because pointers don't behave like integers. It's undefined behavior because the standard says so.

On most platforms, however (if not all), you won't get a crash or run into dubious behavior as long as you don't dereference the pointer. But then, if you don't dereference it, what's the point of doing the addition?

That said, note that an expression producing a pointer one past the end of an array is well defined per §5.7 ¶5 of the C++11 spec: the result is a valid one-past-the-end pointer, and computing it is guaranteed not to overflow. Any other expression going more than one past the array bounds is explicitly undefined behavior.

Note: That does not mean it is safe to read or write through a one-past-the-end pointer. Dereferencing it is undefined behavior, and in practice you would likely be touching data that does not belong to the array, causing state/memory corruption. The guarantee only covers forming the pointer and comparing it against other pointers into the same array.
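To make the one-past-the-end rule concrete, here is a minimal sketch of the usual idiom: the end pointer is formed and compared, but never dereferenced. The names sum_range and sum_example are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>

// Sums the elements in the half-open range [first, last).
// `last` may be a one-past-the-end pointer: it is compared, never dereferenced.
int sum_range(const int* first, const int* last) {
    assert(first <= last);  // both pointers must refer to the same array
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;  // p is always strictly inside the array here
    }
    return total;
}

// Hypothetical helper: walks a 4-element array using arr + 4 as the end marker.
int sum_example() {
    int arr[4] = {0, 1, 2, 3};
    const int* end = arr + 4;     // one past the end: valid to form and compare
    // const int* bad = arr + 5;  // two past the end: undefined behavior, so left commented out
    return sum_range(arr, end);
}
```

Calling sum_example() adds up 0 + 1 + 2 + 3. The loop stops as soon as p equals end, so the element that would sit at arr[4] is never read.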

My guess is that it's specified this way because dereferencing isn't the only thing that can go wrong: pointer arithmetic, pointer comparisons, and so on can too. So it's just easier to say "don't do this" than to enumerate the situations where it can be dangerous.

Luchian Grigore answered Sep 24 '22 13:09


The original x86 can have issues with such statements. In 16-bit code, pointers are 16+16 bits (segment and offset). If you add an offset to the lower 16 bits, you might need to deal with overflow and change the upper 16 bits. That was a slow operation and best avoided.

On those systems, array_base + offset was guaranteed not to overflow if offset was in range (offset <= array size). But array + 5 could overflow if the array contained only 3 elements.

The consequence of that overflow is that you get a pointer which points not past the array, but before it. And that might not even be RAM, but memory-mapped hardware. The C++ standard doesn't try to limit what happens if you construct pointers to random hardware components, i.e. it's Undefined Behavior on real systems.
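The wraparound described above can be simulated with ordinary unsigned arithmetic. This is only a sketch, not real segmented-memory code: the function advance_offset, the base offset 0xFFF8 and the 2-byte element size are assumptions picked for illustration, and only the low 16-bit offset half of the pointer is modelled.

```cpp
#include <cassert>
#include <cstdint>

// Simulates adding an element count to the 16-bit offset half of a
// segment:offset pointer without touching the segment half.
// The result is truncated to 16 bits, so it wraps around at 0x10000.
std::uint16_t advance_offset(std::uint16_t base_offset,
                             unsigned elements,
                             unsigned element_size) {
    return static_cast<std::uint16_t>(base_offset + elements * element_size);
}

// True when the computed offset landed below the array's own start,
// i.e. the addition wrapped and the pointer now points "before" the array.
bool wrapped_before_base(std::uint16_t base_offset, std::uint16_t result) {
    return result < base_offset;
}
```

With an assumed 3-element array of 2-byte values starting at offset 0xFFF8, advancing by 3 elements gives 0xFFFE, the one-past-the-end position, which still fits in 16 bits. Advancing by 5 elements gives 0x10002 before truncation, which becomes 0x0002: an offset near the start of the segment, far below the array.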

MSalters answered Sep 21 '22 13:09