 

Ramifications of C++20 requiring two's complement

C++20 will specify that signed integral types must use two's complement. This doesn't seem like a big change given that (virtually?) every implementation currently uses two's complement.

But I was wondering if this change might shift some "undefined behaviors" to be "implementation defined" or even "defined."

Consider the absolute value function std::abs(int) and some of its overloads. The C++ standard includes this function by reference to the C standard, which says that the behavior is undefined if the result cannot be represented.

In two's complement, there is no positive counterpart to INT_MIN:

abs(INT_MIN) == -INT_MIN == undefined behavior

In sign-magnitude representation, there is:

-INT_MIN == INT_MAX

Thus it seemed reasonable that abs() was left with some undefined behavior.

Once two's complement is required, it would seem to make sense that abs(INT_MIN)'s behavior could be fully specified or, at least, implementation defined, without any issue of backward compatibility. But I don't see any such change proposed.
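For illustration, here is a minimal sketch of what a fully defined result can look like today. The helper name abs_magnitude is mine, not anything from the standard or any proposal; it simply does the negation in unsigned arithmetic, where wraparound is well defined:

    #include <climits>

    // Hypothetical helper: a fully defined absolute value that maps INT_MIN to
    // its true magnitude by working in unsigned arithmetic. In two's complement
    // the magnitude of INT_MIN is INT_MAX + 1, which fits in unsigned int.
    unsigned int abs_magnitude(int x)
    {
        if (x >= 0)
            return static_cast<unsigned int>(x);
        return 0u - static_cast<unsigned int>(x);  // conversion to unsigned is modulo 2^N, so this is defined
    }

    // abs_magnitude(INT_MIN) == 2147483648u with 32-bit int,
    // whereas std::abs(INT_MIN) is undefined behavior.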

The only drawback I see is that the C++ Standard would need to specify abs() explicitly rather than referencing the C Standard's description of abs(). (As far as I know, C does not mandate two's complement.)

Was this just not a priority for the committee or are there still reasons not to take advantage of the simplification and certainty that the two's complement mandate provides?

Asked Aug 05 '19 by Adrian McCarthy

People also ask

What are the disadvantages of 2s complement?

Two's complement is awesome - that's why everyone uses it. The biggest disadvantage is that if you try to negate the lowest representable value, you get an overflow. With one's complement or sign and magnitude, that doesn't happen.

What is the main advantage of the 2's complement?

Compared to other systems for representing signed numbers (e.g., ones' complement), the two's complement has the advantage that the fundamental arithmetic operations of addition, subtraction, and multiplication are identical to those for unsigned binary numbers (as long as the inputs are represented in the same number ...

Do all computers use twos complement?

Do computers always follow the 2's complement method to represent negative numbers? No. Some computers used 1's complement (where ~0 == -0), some used "sign and magnitude" (where, in 8 bits, ~0 == -127), and some used "bias" (where the signed value is "unsigned value - bias" and, with a bias of 128, ~0 == 127).
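For concreteness, here is a small sketch (the decoding functions are my own, purely illustrative) of how the same 8-bit all-ones pattern reads under each of those representations:

    #include <cstdio>

    // Decode an 8-bit pattern (0..255) under each representation mentioned above.
    int twos_complement(unsigned p) { return p < 128u ? static_cast<int>(p) : static_cast<int>(p) - 256; }
    int ones_complement(unsigned p) { return p < 128u ? static_cast<int>(p) : -static_cast<int>(255u - p); }
    int sign_magnitude(unsigned p)  { return p < 128u ? static_cast<int>(p) : -static_cast<int>(p & 0x7Fu); }
    int excess_128(unsigned p)      { return static_cast<int>(p) - 128; }

    int main()
    {
        unsigned all_ones = 0xFF;  // the pattern ~0 in 8 bits
        std::printf("%d %d %d %d\n",
                    twos_complement(all_ones),   // -1
                    ones_complement(all_ones),   // 0 (negative zero has no distinct value in plain int)
                    sign_magnitude(all_ones),    // -127
                    excess_128(all_ones));       // 127
    }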

Does C++ use 2s complement?

C and C++ as specified, however, are not two's complement. Signed integers currently allow the existence of an extraordinary value which traps, extra padding bits, integral negative zero, and introduce undefined behavior and implementation-defined behavior for the sake of this extremely abstract machine.


2 Answers

One of the specific questions considered by the committee was what to do about -INT_MIN, and the results of that poll were:

addition / subtraction / multiplication and -INT_MIN overflow is currently undefined behavior; it should instead be:

4: wrap
6: wrap or trap
5: intermediate values are mathematical integers
14: status quo (remain undefined behavior)

This was explicitly considered and people felt that the best option was keeping it undefined behavior.

To clarify "intermediate values are mathematical integers": another part of the paper explains that this means (int)a + (int)b > INT_MAX might be true.
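To illustrate what that option would permit (this is only a model, not what any standard specifies; the function name is mine), widening the intermediate result to long long gives well-defined code with the behavior described, whereas the plain int expression a + b > INT_MAX is undefined behavior whenever a + b overflows:

    #include <climits>

    // Model of "intermediate values are mathematical integers", assuming
    // long long is wider than int (true on common platforms).
    bool sum_exceeds_int_max(int a, int b)
    {
        return static_cast<long long>(a) + b > INT_MAX;  // no overflow in the wider type
    }

    // sum_exceeds_int_max(INT_MAX, 1) is true; the same comparison done in
    // plain int would be undefined behavior for those arguments.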


Note that implementations are free to define specific behavior in these cases if they so choose. I don't know if any of them do.

Answered Sep 24 '22 by Barry

The Committee that wrote C89 deliberately avoided any judgments about things that quality implementations "should" do when practical. The published Rationale indicates that they expected implementations to behave usefully in circumstances beyond those required by the Standard (and in the case of integer overflow, even documents some very specific expectations), but for whatever reason the Committee deliberately avoided saying such things within the Standard itself.

When later C or C++ committees added new features, they were willing to consider that a feature might be supportable on some platforms and unsupportable on others. There has, however, almost never been any effort to revisit cases where many implementations already process code in the same useful and consistent fashion even though the Standard imposes no requirements: to recognize such cases, and to provide a means by which a program could test whether an implementation supports the behavior, refuse to compile on one that doesn't, and have defined behavior on those that do.

The net effect is that something like:

    unsigned mul_mod_65536(unsigned short x, unsigned short y) { return (x*y) & 0xFFFFu; }

may arbitrarily disrupt the behavior of calling code if the arithmetic value of x*y is between INT_MAX+1u and UINT_MAX, even though that is a situation the authors of the Standard said they expected most implementations to process consistently. The recent Standards have eliminated the main reason the authors of C89 would have expected that some implementations might process the aforementioned function strangely, but that doesn't mean that implementations haven't decided to treat it weirdly in ways the authors of C89 could never have imagined, and would never knowingly have allowed.
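The usual workaround (a sketch of the conventional fix, not anything mandated by the Standard; the function name is mine) is to force the multiplication into unsigned arithmetic before the integral promotion to int can make the overflow undefined:

    // x and y are promoted to (signed) int before the multiplication, so x*y can
    // overflow int. Multiplying by 1u first converts the operands to unsigned int
    // (on platforms where int is wider than short), and unsigned wraparound
    // modulo 2^N is well defined.
    unsigned mul_mod_65536_defined(unsigned short x, unsigned short y)
    {
        return (1u * x * y) & 0xFFFFu;
    }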

Answered Sep 22 '22 by supercat