Rationale for pointer comparisons outside an array to be UB

So, the standard (referring to N1570) says the following about comparing pointers:

C11 6.5.8/5 Relational operators

When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. ... [snip obvious definitions of comparison within aggregates] ... In all other cases, the behavior is undefined.
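
As an illustration of the quoted rule (a minimal sketch, not text from the standard; `within_array` is a hypothetical helper name), the comparisons below are defined only when `p` is derived from `buf`:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: reports whether p lies in buf[0..n], where buf + n
   is the one-past-the-end position. Both relational comparisons are
   defined only when p points into the same array as buf (or one past its
   end). Passing a pointer into an unrelated object makes the comparison
   itself undefined behavior under 6.5.8/5, so this function cannot be
   used as a portable "does p belong to buf?" test. */
static bool within_array(const int *buf, size_t n, const int *p)
{
    return p >= buf && p <= buf + n;
}
```

Calling it with `&a[2]` or `a + 4` for `int a[4]` is fine; calling it with the address of a separate variable falls under "all other cases".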

What is the rationale for this instance of UB, as opposed to specifying (for instance) conversion to intptr_t and comparison of that?

Is there some machine architecture where a sensible total ordering on pointers is hard to construct? Is there some class of optimization or analysis that unrestricted pointer comparisons would impede?

A deleted answer to this question mentions that this piece of UB allows for skipping comparison of segment registers and only comparing offsets. Is that particularly valuable to preserve?

(That same deleted answer, as well as one here, notes that in C++, std::less and the like are required to implement a total order on pointers, whether the normal comparison operator does or not.)

asked Jul 01 '15 01:07

Phil Miller

2 Answers

Various comments in the ub mailing list discussion "Justification for < not being a total order on pointers?" strongly allude to segmented architectures being the reason, including the following comments. 1:

Separately, I believe that the Core Language should simply recognize the fact that all machines these days have a flat memory model.

and 2:

Then we maybe need an new type that guarantees a total order when converted from a pointer (e.g. in segmented architectures, conversion would require taking the address of the segment register and adding the offset stored in the pointer).

and 3:

Pointers, while historically not totally ordered, are practically so for all systems in existence today, with the exception of the ivory tower minds of the committee, so the point is moot.

and 4:

But, even if segmented architectures, unlikely though it is, do come back, the ordering problem still has to be addressed, as std::less is required to totally order pointers. I just want operator< to be an alternate spelling for that property.

Why should everyone else pretend to suffer (and I do mean pretend, because outside of a small contingent of the committee, people already assume that pointers are totally ordered with respect to operator<) to meet the theoretical needs of some currently non-existent architecture?

Counter to the trend of comments from the ub mailing list, FUZxxl points out that supporting DOS is a reason not to support totally ordered pointers.

Update

This is also supported by the Annotated C++ Reference Manual (ARM), which says this was due to the burden of supporting this on segmented architectures:

The expression may not evaluate to false on segmented architectures [...] This explains why addition, subtraction and comparison of pointers are defined only for pointers into an array and one element beyond the end. [...] Users of machines with a nonsegmented address space developed idioms, however, that referred to the elements beyond the end of the array [...] was not portable to segmented architectures unless special effort was taken [...] Allowing [...] would be costly and serve few useful purposes.

answered Oct 09 '22 21:10

Shafik Yaghmour

The 8086 is a processor with 16-bit registers and a 20-bit address space. To cope with the lack of bits in its registers, a set of segment registers exists. On memory access, the dereferenced address is computed like this:

address = 16 * segment + register

Notice that, among other things, an address can generally be represented in multiple ways. Comparing two arbitrary addresses is tedious because the compiler first has to normalize both addresses and then compare the normalized addresses.
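
To make the normalization step concrete, here is a minimal sketch that models a far pointer in plain C. The `struct far_ptr` type and the `far_*` function names are made up for illustration and do not correspond to any particular compiler's runtime:

```c
#include <stdint.h>

/* Hypothetical model of an 8086 real-mode segment:offset pointer. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Physical address: 16 * segment + offset, truncated to the 20 address
   bits of the 8086. */
static uint32_t far_linear(struct far_ptr p)
{
    return (((uint32_t)p.segment << 4) + p.offset) & 0xFFFFFu;
}

/* Normalized form: offset reduced to 0..15, so every physical address
   has exactly one segment:offset representation. */
static struct far_ptr far_normalize(struct far_ptr p)
{
    uint32_t lin = far_linear(p);
    struct far_ptr n = { (uint16_t)(lin >> 4), (uint16_t)(lin & 0xFu) };
    return n;
}

/* Total order on arbitrary far pointers: compares physical addresses,
   which is equivalent to comparing the normalized pairs.
   Returns a negative, zero, or positive value. */
static int far_compare(struct far_ptr a, struct far_ptr b)
{
    uint32_t la = far_linear(a);
    uint32_t lb = far_linear(b);
    return (la > lb) - (la < lb);
}
```

For example, 0x1234:0x0005 and 0x1000:0x2345 both designate physical address 0x12345, so `far_compare` treats them as equal even though neither the segments nor the offsets match. The shift, add, and mask on both operands is the extra work a compiler would have to emit for every unrestricted comparison.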

Many compilers specify (in the memory models where this is possible) that when doing pointer arithmetic, the segment part is to be left untouched. This has several consequences:

objects can have a size of at most 64 kB
all addresses in an object have the same segment part
comparing addresses in an object can be done just by comparing the register part; that can be done in a single instruction

This fast comparison of course only works when the pointers are derived from the same base address, which is one of the reasons why the C standard defines pointer comparisons only for when both pointers point into the same object.
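
The fast path described above can be sketched as follows, again using a hypothetical `struct far_ptr` model rather than any real compiler's representation:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an 8086 real-mode segment:offset pointer. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

/* Offset-only comparison: the segment part is never inspected.
   The result is meaningful only when both pointers were derived from
   the same object and therefore share the same segment part. */
static bool far_less_same_object(struct far_ptr a, struct far_ptr b)
{
    return a.offset < b.offset;
}
```

Within one object, say 0x2000:0x0010 and 0x2000:0x0100, this gives the expected answer with a single 16-bit compare. Applied to pointers into different objects, for instance 0x3000:0x0000 (physical 0x30000) against 0x2000:0x0010 (physical 0x20010), it reports the first as smaller even though its physical address is larger, which is the kind of result that leaving the behavior undefined permits.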

If you want a well-ordered comparison for all pointers, consider converting the pointers to uintptr_t values first.
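
A minimal sketch of that conversion, assuming the implementation provides the optional type `uintptr_t`; the function name `compare_addresses` is made up for this example:

```c
#include <stdint.h>

/* Orders two arbitrary object pointers by their integer representation.
   The pointer-to-integer mapping is implementation-defined, so the order
   is consistent on a given implementation but is not guaranteed by the
   standard to agree with < on pointers into the same array.
   Returns a negative, zero, or positive value. */
static int compare_addresses(const void *a, const void *b)
{
    uintptr_t ua = (uintptr_t)a;
    uintptr_t ub = (uintptr_t)b;
    return (ua > ub) - (ua < ub);
}
```

Because the comparison happens on integers, it is defined for any pair of valid object pointers, including pointers into unrelated objects.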

answered Oct 09 '22 21:10

fuz

Tags: c, pointers, language-lawyer, undefined-behavior
