
Does C check whether a pointer is out of bounds even if the pointer is never dereferenced?

I had an argument with some people who said that out-of-bounds pointers in C cause undefined behavior even if they are never dereferenced. Example:

int a;
int *p = &a;
p = p - 1;

The third line here would cause undefined behavior even if p is never dereferenced (*p is never used).

In my opinion, it sounds illogical that C would check whether a pointer is out of bounds when the pointer is not being used. (It's like inspecting people on the street to see if they're carrying guns in case they enter your house, when the sensible thing is to inspect them when they're about to enter the house.) I think that if C checked for that, it would incur a lot of runtime overhead.

Plus, if C really checks for OOB pointers, then why doesn't this cause UB:

int *p; // uninitialized, thus pointing to a random address

In this case, why does nothing happen even though the chance of p pointing to an OOB address is high?

ADD:

int a;
int *p = &a;
p = p - 1;

Say &a is 1000. Will the value of p after evaluating the third line be:

  • 996, but still undefined behavior because p could be dereferenced somewhere else and cause the real problem, or
  • an undefined value, and that is the undefined behavior?

I ask because I think the third line was called undefined behavior in the first place because of the potential future use of that OOB pointer (dereferencing it), and over time people came to treat it as undefined behavior in its own right. So, is the value of p guaranteed to be 996 with the behavior still undefined, or is the value itself undefined?

asked Jan 16 '17 00:01 by ibrahim mahrir


1 Answer

C does not check whether a pointer is out of bounds. But the underlying hardware might behave in strange ways when an address is computed that falls outside the boundaries of the object (pointing just past the end of an object is the one exception). The C Standard explicitly says that computing such an address invokes undefined behavior.

For most current environments, the above code does not pose a problem, but similar situations could cause segmentation faults in x86 16-bit protected mode, some 25 years ago.

In the language of the Standard, such a value could be a trap representation, something that cannot be manipulated without invoking undefined behavior.

The pertinent section of the C11 Standard is:

6.5.6 Additive operators

  8. When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. [...] If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
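To make the quoted rule concrete, here is a minimal sketch of pointer arithmetic that stays inside the permitted range. The function names (sum_range, sum_three, sum_single) are hypothetical and only serve the illustration. It also relies on paragraph 7 of the same section, which says that a pointer to an object that is not an array element behaves like a pointer to the first element of an array of length one.

```c
/* Hypothetical helper: sums the ints in the half-open range [first, last).
   `last` may be the one-past-the-end pointer: it is compared against,
   but never dereferenced. */
int sum_range(const int *first, const int *last)
{
    int sum = 0;
    for (const int *p = first; p != last; p++) {
        sum += *p;  /* p points to a real element on every iteration */
    }
    return sum;     /* the loop stops with p == last, which is still a valid pointer */
}

/* Array case: v + 3 is one past the last element, which is allowed. */
int sum_three(void)
{
    int v[3] = {1, 2, 3};
    return sum_range(v, v + 3);
}

/* Single-object case: &a behaves like a pointer into an array of length one,
   so p + 1 is valid to compute, while p - 1 would be undefined behavior. */
int sum_single(void)
{
    int a = 42;
    const int *p = &a;
    return sum_range(p, p + 1);
}
```

Calling sum_three() or sum_single() never forms a pointer before the start of the object or more than one past its end, so every pointer computed is one the Standard allows.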

A similar example of undefined behavior is this:

char *p;
char *q = p;

Merely loading the value of uninitialized pointer p invokes undefined behavior, even if it is never dereferenced.

EDIT: it is a moot point trying to argue about this. The Standard says computing such an address invokes undefined behavior, so it does. The fact that some implementations might just compute some value and store it or not is irrelevant. Do not rely on any assumptions regarding undefined behavior: the compiler might take advantage of its inherently unpredictable nature to perform optimizations that you cannot imagine.
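If the goal is only to compute or inspect a numeric address, one option is to do the arithmetic on an integer instead of on the pointer. The following is a sketch under assumptions: it assumes the implementation provides the optional type uintptr_t, the function names are hypothetical, and the mapping from pointer to integer is implementation-defined, so the result is only an "address" in the usual sense on platforms with a flat address space.

```c
#include <stdint.h>

/* Hypothetical helper: returns the integer that is sizeof(int) less than the
   integer representation of p. Unsigned integer arithmetic is fully defined
   (it wraps around), and no out-of-bounds pointer is ever formed. */
uintptr_t address_before(const int *p)
{
    uintptr_t addr = (uintptr_t)(const void *)p;  /* implementation-defined mapping */
    return (uintptr_t)(addr - sizeof(int));
}

/* Checks that adding sizeof(int) back yields the original integer value.
   Returns 1 on success, 0 otherwise. */
int roundtrip_ok(void)
{
    int a = 0;
    uintptr_t before = address_before(&a);
    return (uintptr_t)(before + sizeof(int)) == (uintptr_t)(const void *)&a;
}
```

Note that this only sidesteps the pointer arithmetic. Converting the resulting integer back to a pointer and using it is not something the Standard guarantees to work.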

For example this loop:

for (int i = 1; i != 0; i++) {
    ...
}

might compile to an infinite loop without any test at all: i++ invokes undefined behavior if i is INT_MAX, so the compiler's analysis is this:

  • initial value of i is > 0.
  • for any positive value of i < INT_MAX, i is still > 0 after i++
  • for i = INT_MAX, i++ invokes undefined behavior, so we can assume i > 0 because we can assume anything we please.

Therefore i is always > 0 and the test code can be removed.
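If the loop is meant to rely on wraparound, one way to get it without undefined behavior is an unsigned counter, since unsigned arithmetic is defined to wrap modulo UINT_MAX + 1. A signed counter can also be made safe by testing before the increment. This is only a sketch: both function names are hypothetical, and each takes a starting value so that the demonstration finishes quickly.

```c
#include <limits.h>

/* Hypothetical helper: counts the iterations of
   `for (i = start; i != 0; i++)` with an unsigned counter.
   Incrementing UINT_MAX yields 0 by definition, so the test `i != 0`
   is meaningful and the compiler may not remove it. */
unsigned count_until_wrap(unsigned start)
{
    unsigned n = 0;
    for (unsigned i = start; i != 0; i++) {
        n++;
    }
    return n;
}

/* Signed alternative: leave the loop before i++ could overflow,
   so INT_MAX + 1 is never evaluated. */
int count_to_int_max(int start)
{
    int n = 0;
    for (int i = start; ; i++) {
        n++;
        if (i == INT_MAX) {
            break;  /* exit before the increment that would overflow */
        }
    }
    return n;
}
```

For instance, count_until_wrap(UINT_MAX - 2) visits UINT_MAX - 2, UINT_MAX - 1 and UINT_MAX, then wraps to 0 and stops, while count_to_int_max(INT_MAX - 2) visits the last three int values and exits without overflowing.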

answered Sep 23 '22 17:09 by chqrlie