After reading this question on signed/unsigned compares (they come up every couple of days I'd say):
I wondered why we don't have proper signed unsigned compares and instead this horrible mess? Take the output from this small program:
#include <stdio.h>
#define C(T1,T2)\
{signed T1 a=-1;\
unsigned T2 b=1;\
printf("(signed %5s)%d < (unsigned %5s)%d = %d\n",#T1,(int)a,#T2,(int)b,(a<b));}\
#define C1(T) printf("%s:%d\n",#T,(int)sizeof(T)); C(T,char);C(T,short);C(T,int);C(T,long);
int main()
{
C1(char); C1(short); C1(int); C1(long);
}
Compiled with my standard compiler (gcc, 64bit), I get this:
char:1
(signed char)-1 < (unsigned char)1 = 1
(signed char)-1 < (unsigned short)1 = 1
(signed char)-1 < (unsigned int)1 = 0
(signed char)-1 < (unsigned long)1 = 0
short:2
(signed short)-1 < (unsigned char)1 = 1
(signed short)-1 < (unsigned short)1 = 1
(signed short)-1 < (unsigned int)1 = 0
(signed short)-1 < (unsigned long)1 = 0
int:4
(signed int)-1 < (unsigned char)1 = 1
(signed int)-1 < (unsigned short)1 = 1
(signed int)-1 < (unsigned int)1 = 0
(signed int)-1 < (unsigned long)1 = 0
long:8
(signed long)-1 < (unsigned char)1 = 1
(signed long)-1 < (unsigned short)1 = 1
(signed long)-1 < (unsigned int)1 = 1
(signed long)-1 < (unsigned long)1 = 0
If I compile for 32 bit, the result is the same except that:
long:4
(signed long)-1 < (unsigned int)1 = 0
The "How?" of all this is easy to find: Just goto section 6.3 of the C99 standard or chapter 4 of C++ and dig up the clauses which describe how the operands are converted to a common type and this can break if the common type reinterprets negative values.
But what about the "Why?". As we can see, the '<' fails in 50% of all cases, also it depends on the concrete sizes of the types, so it is platform dependent. Here are some points to consider:
The convert & compare process is not really a prime example for the Rule of Least Surprise
I don't believe that there is code out there, which relies on the proposition that (short)-1 > (unsigned)1
and is not written by terrorists.
This is all terrible when you're in C++ with template code, because you need type trait magic to knit a correct "<".
After all, comparing signed and unsigned value of different types is easy to implement:
signed X < unsigned Y -> (a<(X)0) || ((Z)a<(Z)b) where Z=X|Y
The pre-check is cheap and can also be optimized away by the compiler if a>=0 can statically be proven.
So here's my question:
Would it break the language or existing code if we'd add safe signed/unsigned compares to C/C++?
("Would it break the language" means would we need to make massive changes to different parts of the language to accommodate this change)
UPDATE: I've ran this on my good old Turbo-C++ 3.0 and got this output:
char:1
(signed char)-1 < (unsigned char)1 = 0
Why is (signed char)-1 < (unsigned char) == 0
here?
Well the basic answer is – nothing. No bits are changed, the compiler just treats the bit representation as unsigned. For example, let us assume that the compiler represents signed integers using 2's complement notation (this is the norm – but is *not* mandated by the C language).
The result is of the same unsigned type. If you mix types of different rank, the higher-ranked type "wins", if it can represent all values of lower-ranked type. The calculations are performed within that type.
The hardware is designed to compare signed to signed and unsigned to unsigned. If you want the arithmetic result, convert the unsigned value to a larger signed type first. Otherwise the compiler wil assume that the comparison is really between unsigned values. And -1 is represented as 1111..
Sure, comparisons between signed and unsigned would be slower, but their result would be more correct in some sense. @Nawaz Your bottom line conclusion is incorrect, unfortunately: if the signed type can contain the unsigned type, the unsigned will be converted to the signed type and not the opposite.
My answer is for C only.
There is no type in C that can accomodate all possible values of all possible integer types. The closest C99 comes to this is intmax_t
and uintmax_t
, and their intersection only covers half their respective range.
Therefore, you cannot implement a mathematical value comparison of such as x <= y
by first converting x
and y
to a common type and then doing a simple operation. This is a major departure from a general principle of how operators work. It also breaks the intuition that operators correspond to things that tend to be single instructions in common hardware.
Even if you added this additional complexity to the language (and extra burden to implementation writers), it wouldn't have very nice properties. For example, x <= y
would still not be equivalent to x - y <= 0
. If you wanted all these nice properties, you'd have to make arbitrary-sized integers part of the language.
I'm sure there's plenty of old unix code out there, possibly some running on your machine, that assumes that (int)-1 > (unsigned)1
. (Ok, maybe it was written by freedom fighters ;-)
If you want lisp/haskell/python/$favorite_language_with_bignums_built_in, you know where to find it...
Yes it would break the language/existing code. The language, as you have noted, carefully specifies the behavior when signed and unsigned operands are used together. This behavior with comparison operators is essential for some important idioms, like:
if (x-'0' < 10U)
Not to mention things like (equality comparison):
size_t l = mbrtowc(&wc, s, n, &state);
if (l==-1) ... /* Note that mbrtowc returns (size_t)-1 on failure */
As an aside, specifying "natural" behavior for mixed signed/unsigned comparisons would also incur a significant performance penalty, even in programs which are presently using such comparisons in safe ways where they already have their "natural" behavior due to constraints on the input which the compiler would have a hard time determining (or might not be able to determine at all). In writing your own code to handle these tests, I'm sure you've already seen what the performance penalty would look like, and it's not pretty.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With