C++20 will specify that signed integral types must use two's complement. This doesn't seem like a big change given that (virtually?) every implementation currently uses two's complement.
But I was wondering if this change might shift some "undefined behaviors" to be "implementation defined" or even "defined."
Consider the absolute value function, std::abs(int), and some of its overloads. The C++ Standard includes this function by reference to the C Standard, which says that the behavior is undefined if the result cannot be represented.
In two's complement, there is no positive counterpart to INT_MIN:
abs(INT_MIN) == -INT_MIN == undefined behavior
In sign-magnitude representation, there is:
-INT_MIN == INT_MAX
Thus it seemed reasonable that abs() was left with some undefined behavior.
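For illustration, a caller can already compute the magnitude of any int without undefined behavior by going through unsigned arithmetic. A minimal sketch (uabs is a hypothetical helper, and the numeric comments assume a 32-bit int):

    #include <climits>

    // Hypothetical helper, not part of the standard library: returns |x| as an
    // unsigned value, which is representable for every int, including INT_MIN.
    // int -> unsigned conversion is defined modulo 2^N, so 0u - unsigned(x)
    // yields the magnitude without any signed overflow.
    unsigned uabs(int x)
    {
        return x < 0 ? 0u - static_cast<unsigned>(x)
                     : static_cast<unsigned>(x);
    }

    int main()
    {
        // std::abs(INT_MIN) would be undefined behavior, because +2147483648
        // does not fit in a 32-bit int.
        unsigned m = uabs(INT_MIN);   // well defined: 2147483648 on 32-bit int
        (void)m;
    }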
Once two's complement is required, it would seem to make sense that abs(INT_MIN)'s behavior could be fully specified or, at least, implementation defined, without any issue of backward compatibility. But I don't see any such change proposed.
The only drawback I see is that the C++ Standard would need to specify abs() explicitly rather than referencing the C Standard's description of abs(). (As far as I know, C is not mandating two's complement.)
Was this just not a priority for the committee or are there still reasons not to take advantage of the simplification and certainty that the two's complement mandate provides?
Two's complement is awesome - that's why everyone uses it. The biggest disadvantage is that if you try to negate the lowest representable value, you get an overflow. With one's complement or sign and magnitude, that doesn't happen.
Compared to other systems for representing signed numbers (e.g., ones' complement), two's complement has the advantage that the fundamental arithmetic operations of addition, subtraction, and multiplication are identical to those for unsigned binary numbers (as long as the inputs are represented in the same number of bits as the output, and any overflow beyond those bits is discarded from the result).
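A sketch of what that identity looks like in code, assuming C++20 (which also defines unsigned-to-signed conversion as reduction modulo 2^N; before C++20 that conversion was implementation-defined):

    #include <cstdint>
    #include <iostream>

    // With two's complement, a signed add is the same bit-level operation as an
    // unsigned add: do the addition with wrap-around unsigned arithmetic, then
    // convert the result back to the signed type.
    std::int32_t add_via_unsigned(std::int32_t a, std::int32_t b)
    {
        std::uint32_t sum = static_cast<std::uint32_t>(a) + static_cast<std::uint32_t>(b);
        return static_cast<std::int32_t>(sum);   // modulo-2^32 conversion (C++20)
    }

    int main()
    {
        std::cout << add_via_unsigned(-7, 3) << '\n';     // -4
        std::cout << add_via_unsigned(100, -250) << '\n'; // -150
    }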
Do computers always follow the two's complement method to represent negative numbers? No. Some computers used one's complement (where ~1 == -1 and ~0 == -0, a negative zero), some used "sign and magnitude" (where ~1 == -126 for an 8-bit value), and some used a "bias" (where the signed value is "unsigned value - bias", so with a bias of 127, ~1 == 127).
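For illustration, here is how the same 8-bit pattern (0xFE, i.e., ~1) decodes under those conventions; the decode_* helpers are hypothetical and assume an 8-bit value with a bias of 127:

    #include <cstdint>
    #include <iostream>

    // Each function interprets the raw bit pattern under a different convention.
    int decode_twos_complement(std::uint8_t bits) { return bits < 128 ? bits : bits - 256; }
    int decode_ones_complement(std::uint8_t bits) { return bits < 128 ? bits : -(255 - bits); }
    int decode_sign_magnitude(std::uint8_t bits)  { return bits < 128 ? bits : -(bits & 0x7F); }
    int decode_bias_127(std::uint8_t bits)        { return bits - 127; }

    int main()
    {
        std::uint8_t pattern = 0xFE;                            // ~1 on 8 bits
        std::cout << decode_twos_complement(pattern) << '\n';   // -2
        std::cout << decode_ones_complement(pattern) << '\n';   // -1
        std::cout << decode_sign_magnitude(pattern) << '\n';    // -126
        std::cout << decode_bias_127(pattern) << '\n';          // 127
    }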
C and C++ as specified, however, are not two's complement. Signed integers currently allow the existence of an extraordinary value which traps, extra padding bits, integral negative zero, and introduce undefined behavior and implementation-defined behavior for the sake of this extremely abstract machine.
One of the specific questions considered by the committee was what to do about -INT_MIN, and the results of that poll were:

addition / subtraction / multiplication and -INT_MIN overflow is currently undefined behavior; it should instead be:

4: wrap
6: wrap or trap
5: intermediate values are mathematical integers
14: status quo (remain undefined behavior)
This was explicitly considered and people felt that the best option was keeping it undefined behavior.
To clarify "intermediate values are mathematical integers": another part of the paper explains that this means that (int)a + (int)b > INT_MAX might be true.
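For example, under that option the following comparison would be evaluated in unbounded integers and could be true; under the status quo, overflow in a + b is undefined behavior, so compilers may assume the comparison is always false (a minimal sketch):

    #include <climits>

    // "Intermediate values are mathematical integers" would let this be true
    // when a + b mathematically exceeds INT_MAX. With the status quo, signed
    // overflow is undefined behavior, so the compiler may fold this to false.
    bool sum_exceeds_int_max(int a, int b)
    {
        return a + b > INT_MAX;
    }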
Note that implementations are free to define specific behavior in these cases if they so choose. I don't know if any of them do.
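One concrete mechanism in that spirit, outside the Standard itself: GCC and Clang accept the -fwrapv flag, which defines signed integer overflow as wrapping modulo 2^N for those builds (a sketch of the effect, not something the Standard requires):

    #include <climits>

    // Compiled with "g++ -fwrapv" (or clang++ -fwrapv), signed overflow wraps,
    // so this returns INT_MIN instead of invoking undefined behavior.
    int negate_min()
    {
        int x = INT_MIN;
        return -x;   // wraps back to INT_MIN under -fwrapv; UB otherwise
    }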
The Committee that wrote C89 deliberately avoided any judgments about things that quality implementations "should" do when practical. The published Rationale indicates that they expected implementations to behave usefully in circumstances beyond those required by the Standard (and in the case of integer overflow, even documents some very specific expectations), but for whatever reason the Committee deliberately avoided saying such things within the Standard itself.
When later C or C++ committees added new features, they were willing to consider the possibility that a feature might be supportable on some platforms and unsupportable on others. There has, however, been almost no effort to revisit whether the Standard should recognize cases where many implementations would process code in the same useful and consistent fashion even though the Standard imposed no requirements, and to provide a means by which a program could test whether an implementation supports such behavior, refuse to compile on one that doesn't, and have defined behavior on those that do.
The net effect is that something like: unsigned mul_mod_65536(unsigned short x, unsigned short y) { return (x*y) & 0xFFFFu; } may arbitrarily disrupt the behavior of calling code if the arithmetical value of x*y is between INT_MAX+1u and UINT_MAX, even though that is a situation the authors of the Standard said they expected to be processed consistently by most implementations. Recent Standards have eliminated the main reason the authors of C89 would have expected some implementations to process the above function strangely, but that doesn't mean implementations haven't decided to treat it weirdly in ways the authors of C89 could never have imagined, and would never knowingly have allowed.
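For reference, the hazard comes from the usual arithmetic conversions: unsigned short promotes to (signed) int, so x*y is a signed multiplication that can overflow. A common way to sidestep it, not taken from the answer above, is to force the arithmetic into unsigned:

    // Original shape of the function: x and y promote to int, so x*y is a
    // signed multiply and overflows (undefined behavior) when the mathematical
    // product exceeds INT_MAX, even though only the low 16 bits are wanted.
    unsigned mul_mod_65536(unsigned short x, unsigned short y)
    {
        return (x * y) & 0xFFFFu;        // UB for large products
    }

    // Forcing unsigned arithmetic avoids the promotion trap: 1u * x converts x
    // to unsigned int, and unsigned overflow is defined to wrap.
    unsigned mul_mod_65536_fixed(unsigned short x, unsigned short y)
    {
        return (1u * x * y) & 0xFFFFu;   // well defined for all inputs
    }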