So, the standard (referring to N1570) says the following about comparing pointers:
C99 6.5.8/5 Relational operators
When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. ... [snip obvious definitions of comparison within aggregates] ... In all other cases, the behavior is undefined.
What is the rationale for this instance of UB, as opposed to specifying (for instance) conversion to intptr_t
and comparison of that?
Is there some machine architecture where a sensible total ordering on pointers is hard to construct? Is there some class of optimization or analysis that unrestricted pointer comparisons would impede?
A deleted answer to this question mentions that this piece of UB allows for skipping comparison of segment registers and only comparing offsets. Is that particularly valuable to preserve?
(That same deleted answer, as well as one here, note that in C++, std::less
and the like are required to implement a total order on pointers, whether the normal comparison operator does or not.)
When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object types both point to the same object, or both point one past the last element of the same array object, they compare equal.
We can compare pointers if they are pointing to the same array. Relational pointers can be used to compare two pointers. Pointers can't be multiplied or divided.
Relational Operators Relational operations are often used to compare the values of the variable based on which we can take decisions. The given table shows the relational operators that can be performed on pointer variables. The value of the relational expression is either 0 or 1 that is false or true.
Various comments in the ub mailing list discussion Justification for < not being a total order on pointers? strongly allude to segmented architectures being the reason. Including the follow comments, 1:
Separately, I believe that the Core Language should simply recognize the fact that all machines these days have a flat memory model.
and 2:
Then we maybe need an new type that guarantees a total order when converted from a pointer (e.g. in segmented architectures, conversion would require taking the address of the segment register and adding the offset stored in the pointer).
and 3:
Pointers, while historically not totally ordered, are practically so for all systems in existence today, with the exception of the ivory tower minds of the committee, so the point is moot.
and 4:
But, even if segmented architectures, unlikely though it is, do come back, the ordering problem still has to be addressed, as std::less is required to totally order pointers. I just want operator< to be an alternate spelling for that property.
Why should everyone else pretend to suffer (and I do mean pretend, because outside of a small contingent of the committee, people already assume that pointers are totally ordered with respect to operator<) to meet the theoretical needs of some currently non-existent architecture?
Counter to the trend of comments from the ub mailing list, FUZxxl points out that supporting DOS is a reason not to support totally ordered pointers.
Update
This is also supported by the Annotated C++ Reference Manual(ARM) which says this was due to burden of supporting this on segmented architectures:
The expression may not evaluate to false on segmented architectures [...] This explains why addition, subtraction and comparison of pointers are defined only for pointers into an array and one element beyond the end. [...] Users of machines with a nonsegmented address space developed idioms, however, that referred to the elements beyond the end of the array [...] was not portable to segmented architectures unless special effort was taken [...] Allowing [...] would be costly and serve few useful purposes.
The 8086 is a processor with 16 bit registers and a 20 bit address space. To cope with the lack of bits in its registers, a set of segment registers exists. On memory access, the dereferenced address is computed like this:
address = 16 * segment + register
Notice that among other things, an address has generally multiple ways to be represented. Comparing two arbitrary addresses is tedious as the compiler has to first normalize both addresses and then compare the normalized addresses.
Many compilers specify (in the memory models where this is possible) that when doing pointer arithmetic, the segment part is to be left untouched. This has several consequences:
This fast comparison of course only works when the pointers are derived from the same base-address, which is one of the reasons why the C standard defines pointer comparisons only for when both pointers point into the same object.
If you want a well-ordered comparison for all pointers, consider converting the pointers to uintptr_t
values first.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With