 

Ramifications of C++20 requiring two's complement

C++20 will specify that signed integral types must use two's complement. This doesn't seem like a big change given that (virtually?) every implementation currently uses two's complement.

But I was wondering if this change might shift some "undefined behaviors" to be "implementation defined" or even "defined."

Consider the absolute value function std::abs(int) and some of its overloads. The C++ standard includes this function by reference to the C standard, which says that the behavior is undefined if the result cannot be represented.

In two's complement, there is no positive counterpart to INT_MIN:

abs(INT_MIN) == -INT_MIN == undefined behavior

In sign-magnitude representation, there is:

-INT_MIN == INT_MAX

Thus it seemed reasonable that abs() was left with some undefined behavior.

Once two's complement is required, it would seem to make sense that abs(INT_MIN)'s behavior could be fully specified or, at least, implementation defined, without any issue of backward compatibility. But I don't see any such change proposed.
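For illustration, here is a minimal sketch of what a fully defined result can look like today. The helper name abs_magnitude is mine, not anything from the standard or any proposal; it simply does the negation in unsigned arithmetic, where wraparound is well defined:

    #include <climits>

    // Hypothetical helper: a fully defined absolute value that maps INT_MIN to
    // its true magnitude by working in unsigned arithmetic. In two's complement
    // the magnitude of INT_MIN is INT_MAX + 1, which fits in unsigned int.
    unsigned int abs_magnitude(int x)
    {
        if (x >= 0)
            return static_cast<unsigned int>(x);
        return 0u - static_cast<unsigned int>(x);  // conversion to unsigned is modulo 2^N, so this is defined
    }

    // abs_magnitude(INT_MIN) == 2147483648u with 32-bit int,
    // whereas std::abs(INT_MIN) is undefined behavior.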

The only drawback I see is that the C++ Standard would need to specify abs() explicitly rather than referencing the C Standard's description of abs(). (As far as I know, C does not mandate two's complement.)

Was this just not a priority for the committee or are there still reasons not to take advantage of the simplification and certainty that the two's complement mandate provides?

Asked Aug 05 '19 by Adrian McCarthy

People also ask

What are the disadvantages of 2s complement?

Two's complement is awesome - that's why everyone uses it. The biggest disadvantage is that if you try to negate the lowest representable value, you get an overflow. With one's complement or sign and magnitude, that doesn't happen.

What is the main advantage of the 2's complement?

Compared to other systems for representing signed numbers (e.g., ones' complement), the two's complement has the advantage that the fundamental arithmetic operations of addition, subtraction, and multiplication are identical to those for unsigned binary numbers (as long as the inputs are represented in the same number ...

Do all computers use twos complement?

Do computers always follow the 2's complement method to represent negative numbers? No. Some computers used 1's complement (where ~0 == -0), some used "sign and magnitude" (where, in 8 bits, ~0 == -127), and some used "bias" (where the signed value is "unsigned value - bias" and, with a bias of 128, ~0 == 127).
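For concreteness, here is a small sketch (the decoding functions are my own, purely illustrative) of how the same 8-bit all-ones pattern reads under each of those representations:

    #include <cstdio>

    // Decode an 8-bit pattern (0..255) under each representation mentioned above.
    int twos_complement(unsigned p) { return p < 128u ? static_cast<int>(p) : static_cast<int>(p) - 256; }
    int ones_complement(unsigned p) { return p < 128u ? static_cast<int>(p) : -static_cast<int>(255u - p); }
    int sign_magnitude(unsigned p)  { return p < 128u ? static_cast<int>(p) : -static_cast<int>(p & 0x7Fu); }
    int excess_128(unsigned p)      { return static_cast<int>(p) - 128; }

    int main()
    {
        unsigned all_ones = 0xFF;  // the pattern ~0 in 8 bits
        std::printf("%d %d %d %d\n",
                    twos_complement(all_ones),   // -1
                    ones_complement(all_ones),   // 0 (negative zero has no distinct value in plain int)
                    sign_magnitude(all_ones),    // -127
                    excess_128(all_ones));       // 127
    }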

Does C++ use 2s complement?

C and C++ as specified, however, are not two's complement. Signed integers currently allow the existence of an extraordinary value which traps, extra padding bits, integral negative zero, and introduce undefined behavior and implementation-defined behavior for the sake of this extremely abstract machine.


2 Answers

One of the specific questions considered by the committee was what to do about -INT_MIN, and the results of that poll were:

addition / subtraction / multiplication and -INT_MIN overflow is currently undefined behavior; it should instead be:

4: wrap
6: wrap or trap
5: intermediate values are mathematical integers
14: status quo (remain undefined behavior)

This was explicitly considered and people felt that the best option was keeping it undefined behavior.

To clarify "intermediate values are mathematical integers": another part of the paper explains that this means (int)a + (int)b > INT_MAX might be true.
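To illustrate what that option would permit (this is only a model, not what any standard specifies; the function name is mine), widening the intermediate result to long long gives well-defined code with the behavior described, whereas the plain int expression a + b > INT_MAX is undefined behavior whenever a + b overflows:

    #include <climits>

    // Model of "intermediate values are mathematical integers", assuming
    // long long is wider than int (true on common platforms).
    bool sum_exceeds_int_max(int a, int b)
    {
        return static_cast<long long>(a) + b > INT_MAX;  // no overflow in the wider type
    }

    // sum_exceeds_int_max(INT_MAX, 1) is true; the same comparison done in
    // plain int would be undefined behavior for those arguments.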


Note that implementations are free to define specific behavior in these cases if they so choose. I don't know if any of them do.

Answered Sep 24 '22 by Barry

The Committee that wrote C89 deliberately avoided any judgments about things that quality implementations "should" do when practical. The published Rationale indicates that they expected implementations to behave usefully in circumstances beyond those required by the Standard (and in the case of integer overflow, even documents some very specific expectations), but for whatever reason the Committee deliberately avoided saying such things within the Standard itself.

When later C or C++ committees added new features, they were willing to consider that a feature might be supportable on some platforms and unsupportable on others. There has, however, almost never been any effort to revisit cases where many implementations already process code in the same useful and consistent fashion even though the Standard imposes no requirements: to recognize such cases, and to provide a means by which a program could test whether an implementation supports the behavior, refuse to compile on one that doesn't, and have defined behavior on those that do.

The net effect is that something like:

    unsigned mul_mod_65536(unsigned short x, unsigned short y) { return (x*y) & 0xFFFFu; }

may arbitrarily disrupt the behavior of calling code if the arithmetic value of x*y is between INT_MAX+1u and UINT_MAX, even though that is a situation the authors of the Standard said they expected most implementations to process consistently. The recent Standards have eliminated the main reason the authors of C89 would have expected that some implementations might process the aforementioned function strangely, but that doesn't mean that implementations haven't decided to treat it weirdly in ways the authors of C89 could never have imagined, and would never knowingly have allowed.
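The usual workaround (a sketch of the conventional fix, not anything mandated by the Standard; the function name is mine) is to force the multiplication into unsigned arithmetic before the integral promotion to int can make the overflow undefined:

    // x and y are promoted to (signed) int before the multiplication, so x*y can
    // overflow int. Multiplying by 1u first converts the operands to unsigned int
    // (on platforms where int is wider than short), and unsigned wraparound
    // modulo 2^N is well defined.
    unsigned mul_mod_65536_defined(unsigned short x, unsigned short y)
    {
        return (1u * x * y) & 0xFFFFu;
    }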

Answered Sep 22 '22 by supercat