Is this gcc/clang past-one pointer comparison behavior conforming or non-standard?

Tags: c, optimization, gcc, clang

Even though the C Standard explicitly recognizes the possibility that an address which points "just past" one object might by happenstance compare equal to one which points "to" another unrelated object, both gcc and clang seem to operate on the assumption that no pointer that is observed to point just past one object can possibly point to another, as evidenced by this example:

#include <stdio.h>

int x[1],y[1];
int test1(int *p)
{
    y[0] = 1;
    if (p==x+1)
        *p = 2; // Note that assignment is to *p and not to x[1] !!!
    return y[0];
}
int test2(int *p)
{
    x[0] = 1;
    if (p==y+1)
        *p = 2; // Note that assignment is to *p and not to y[1] !!!
    return x[0];
}

int (*volatile test1a)(int *p) = test1;
int (*volatile test2a)(int *p) = test2;

int main(void) {
    int q;
    printf("%llX\n",(unsigned long long)y - (unsigned long long)x);
    q = test1a(y);
    printf(">> %d %d\n", y[0], q);
    q = test2a(x);
    printf(">> %d %d\n", x[0], q);
    return 0;
}

By my reading of the Standard, valid outputs from this program on the lines marked >> would be either >> 1 1 or >> 2 2, but gcc on ideone outputs >> 2 1 for one of the lines and from what I can tell the code generated by clang would do likewise for the other.

I am well aware that the fact that p compares equal to x+1 would not imply that x[1] could be used to access the same object as *p (or any object at all, for that matter), but I am aware of nothing in the Standard that would forbid the computation of x+1 or the comparison between the resulting pointer and p. I am also aware of nothing that would cause such a comparison to make p unusable for accessing the object whose address it holds.

Is there any plausible reading of any published or draft version of the C Standard under which the above code invokes Undefined Behavior, or under which the returned values from test1 and test2 would not be required to match the final values of y[0] or x[0], respectively, or are the optimizers in clang and gcc designed to process a dialect that isn't a published or draft version of the Standard?

PS: From the standard draft N1570, 6.5.9p6:

6 Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.

The Standard does not in any way imply that x[] must follow y[] nor vice versa, but seems to be explicitly providing for the possibility that a pointer that points just past x might be compared to y and observed to be equal.
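The four cases in the quoted paragraph can be illustrated with a minimal sketch. The names `a`, `b`, `s`, and the helper functions are hypothetical and chosen only for illustration. The first three helpers exercise cases the Standard guarantees to compare equal; the last one is layout-dependent and may yield either 0 or 1, depending on where the implementation places `b` relative to `a`.

```c
#include <stddef.h>

struct pair { int first; int second; };

int a[4];
int b[4];
struct pair s;

/* Case 1: two null pointers compare equal. */
int both_null(void)
{
    int *p = NULL, *q = NULL;
    return p == q;
}

/* Case 2: a pointer to an object and a pointer to the subobject
   at its beginning compare equal. */
int object_and_first_member(void)
{
    return (void *)&s == (void *)&s.first;
}

/* Case 3: two pointers to one past the last element of the same
   array compare equal. Forming a+4 is allowed; dereferencing it is not. */
int both_one_past(void)
{
    int *p = a + 4, *q = &a[4];
    return p == q;
}

/* Case 4: one past the end of a versus the start of b.
   The comparison itself is permitted; its result depends on layout. */
int one_past_vs_start(void)
{
    return (a + 4) == b;
}
```

Only the fourth helper depends on how the implementation lays out `a` and `b`, which is exactly the situation the question is about.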

669

asked Apr 18 '19 22:04

supercat

1 Answer

This is indeed a compiler bug. In most cases, when p == x+i evaluates to true for some pointer p, array x, and integer i, it is true that p necessarily points to an element in x (or the program is executing code with undefined behavior, in which case the compiler is permitted to assume p points to an element in x). If p did point to an element in x, it would be true that *p = 2; cannot change y[0], and therefore it would be correct for the compiler to generate code that returns the 1 that was recently assigned to y[0]. The clues suggest the compiler has attempted an optimization in which it “learns” about p through the truth of the comparison.

This deduction fails when x+i points to one beyond the end of the array x. In C 2018 6.5.9 paragraph 6, the standard tells us that this comparison may evaluate to true even though p points to a different object, unrelated to x (but that happens to immediately follow it in memory). The compiler’s deduction ought to be more limited. Given p+j == x+i, where x is an array of n elements, the fact that the comparison evaluates to true implies only that p+k points to an element of x for j−i ≤ k < n+j−i. (In the case in the question, n=1, i=1, j=0, and k=0, which fails 0−1 ≤ 0 < 1+0−1.)

(Note that the criterion presented above implies, for the case in the question, that p-1 points to an element of x, since 0−1 ≤ −1 < 0 holds, yet we know p points into y and may not be used to access x. But accessing p[-1] through it has behavior not defined by the standard, meaning the compiler may assume anything about it, including that p[-1] is an element of x. So the compiler is permitted to use the criterion.)
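The range condition can be written out as a small predicate, as a rough sketch only. The function name `implied_in_x` and its parameter order are assumptions made for illustration; this is not code from gcc or clang. It answers one question: given that p+j == x+i was observed to be true and x has n elements, does the comparison alone justify treating p+k as an element of x?

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: encodes the condition j-i <= k < n+j-i.
   i, j: offsets in the observed comparison p+j == x+i
   n:    number of elements in the array x
   k:    offset from p that the compiler wants to reason about */
bool implied_in_x(ptrdiff_t i, ptrdiff_t j, ptrdiff_t n, ptrdiff_t k)
{
    return (j - i <= k) && (k < n + j - i);
}
```

For the question's case (i=1, j=0, n=1), `implied_in_x(1, 0, 1, 0)` is false, so nothing may be concluded about `*p`, while `implied_in_x(1, 0, 1, -1)` is true, which matches the remark about `p[-1]`.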

156

answered Oct 20 '22 01:10

Eric Postpischil
