
Is this gcc/clang past-one pointer comparison behavior conforming or non-standard?

Even though the C Standard explicitly recognizes the possibility that an address which points "just past" one object might happen to compare equal to one which points "to" another unrelated object, both gcc and clang seem to operate on the assumption that no pointer that is observed to point just past one object can possibly point to another, as evidenced by this example:

#include <stdio.h>

int x[1],y[1];
int test1(int *p)
{
    y[0] = 1;
    if (p==x+1)
        *p = 2; // Note that assignment is to *p and not to x[1] !!!
    return y[0];
}
int test2(int *p)
{
    x[0] = 1;
    if (p==y+1)
        *p = 2; // Note that assignment is to *p and not to y[1] !!!
    return x[0];
}

int (*volatile test1a)(int *p) = test1;
int (*volatile test2a)(int *p) = test2;

int main(void) {
    int q;
    printf("%llX\n",(unsigned long long)y - (unsigned long long)x);
    q = test1a(y);
    printf(">> %d %d\n", y[0], q);
    q = test2a(x);
    printf(">> %d %d\n", x[0], q);
    return 0;
}

By my reading of the Standard, valid outputs from this program on the lines marked >> would be either >> 1 1 or >> 2 2, but gcc on ideone outputs >> 2 1 for one of the lines and from what I can tell the code generated by clang would do likewise for the other.

I am well aware that the fact that p compares equal to x+1 would not imply that x[1] could be used to access the same object as *p (or any object at all, for that matter), but I am aware of nothing in the Standard that would forbid the computation of x+1 or the comparison between the resulting pointer and p. I am also aware of nothing that would cause such a comparison to make p unusable for accessing the object whose address it holds.

Is there any plausible reading of any published or draft version of the C Standard under which the above code invokes Undefined Behavior, or under which the returned values from test1 and test2 would not be required to match the final values of y[0] or x[0], respectively, or are the optimizers in clang and gcc designed to process a dialect that isn't a published or draft version of the Standard?

PS--From the standard draft N1570 6.5.9p6:

6 Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

The Standard does not in any way imply that x[] must follow y[] nor vice versa, but seems to be explicitly providing for the possibility that a pointer that points just past x might be compared to y and observed to be equal.
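As an illustration of the quoted paragraph, here is a minimal sketch assuming the same two file-scope arrays as in the example; `is_one_past_x` is a hypothetical helper name, not something from the Standard. It only forms and compares the one-past pointer and never dereferences it.

```c
#include <stdbool.h>

int x[1], y[1];

/* Hypothetical helper: reports whether p compares equal to the
   one-past-the-end pointer of x. Forming x + 1 is permitted;
   dereferencing it would not be. */
bool is_one_past_x(const int *p)
{
    return p == x + 1;
}
```

Calling `is_one_past_x(x + 1)` yields true and `is_one_past_x(x)` yields false. Under 6.5.9p6, `is_one_past_x(y)` may legitimately be true or false, depending on whether y happens to immediately follow x in the address space.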

supercat asked Apr 18 '19 22:04





1 Answer

This is indeed a compiler bug. In most cases, when p == x+i evaluates to true for some pointer p, array x, and integer i, it is true that p necessarily points to an element in x (or the program is executing code with undefined behavior, in which case the compiler is permitted to assume p points to an element in x). If p did point to an element in x, it would be true that *p = 2; cannot change y[0], and therefore it would be correct for the compiler to generate code that returns the 1 that was recently assigned to y[0]. The clues suggest the compiler has attempted an optimization in which it “learns” about p through the truth of the comparison.
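The effect described above can be modelled by hand. The sketch below is an assumption about the observable result, not actual compiler output: `test1_as_written` repeats the function from the question, and `test1_model` (a hypothetical name) shows what the generated code appears to do once the reload of y[0] has been replaced by the constant that was just stored.

```c
int x[1], y[1];

/* The function as it appears in the question. */
int test1_as_written(int *p)
{
    y[0] = 1;
    if (p == x + 1)
        *p = 2;
    return y[0];        /* must reflect any store made through p */
}

/* Hand-written model of the observed behavior (an assumption, not
   real compiler output): the store through p is still performed,
   but the return value is the constant 1 rather than a reload. */
int test1_model(int *p)
{
    y[0] = 1;
    if (p == x + 1)
        *p = 2;
    return 1;
}
```

If y happens to follow x in memory, `test1_model(y)` leaves y[0] equal to 2 while returning 1, which matches the `>> 2 1` line reported in the question.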

This deduction fails when x+i points to one beyond the end of the array x. In C 2018 6.5.9 6, the standard tells us that this comparison may evaluate to true even though p points to a different object, unrelated to x (but that happens to immediately follow it in memory). The compiler’s deduction ought to be more limited. Given p+j == x+i, where x is an array of n elements, the fact the evaluation is true only implies that p+k points to an element of x for j−i ≤ k < n+j−i. (In the case in the question, i=1, j=0, and k=0, which fails 0−1 ≤ 0 < 1+0−1.)

(Note that the criterion presented above implies, for the case in the question, that p-1 points to an element of x, since 0−1 ≤ −1 < 0 holds, yet we know p points into y and is not valid for pointing into x. But using it to access p[-1] has behavior not defined by the standard, meaning any assumptions are permitted, including that p[-1] is an element of x. So the compiler is permitted to use the criterion.)
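The range criterion in this answer can be written as a small predicate. This is only a sketch of the arithmetic; `may_assume_in_x` is a hypothetical name, and the parameters i, j, k, and n have the same meaning as in the text (p+j == x+i was observed to be true, and x has n elements).

```c
#include <stdbool.h>

/* Hypothetical helper: given that p+j == x+i evaluated to true for
   an n-element array x, may p+k be assumed to point to an element
   of x? The answer's criterion is j-i <= k < n+j-i. */
bool may_assume_in_x(long i, long j, long k, long n)
{
    return (j - i <= k) && (k < n + j - i);
}
```

For the case in the question, `may_assume_in_x(1, 0, 0, 1)` is false, so nothing may be assumed about p itself, while `may_assume_in_x(1, 0, -1, 1)` is true, which corresponds to the p-1 case discussed in the note above.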

Eric Postpischil answered Oct 20 '22 01:10
