Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is -(-2147483648) = - 2147483648 in a 32-bit machine?

I think the question is self explanatory, I guess it probably has something to do with overflow but still I do not quite get it. What is happening, bitwise, under the hood?

Why does -(-2147483648) = -2147483648 (at least while compiling in C)?

like image 385
Lesscomfortable Avatar asked Feb 25 '17 22:02

Lesscomfortable


People also ask

What is the number 2147483648?

Integers.info - Binary numbers: 2147483648 = 10000000000000000000000000000000.

What is the 32-bit integer limit?

A 32-bit signed integer. It has a minimum value of -2,147,483,648 and a maximum value of 2,147,483,647 (inclusive).


2 Answers

Note: this answer does not apply as such on the obsolete ISO C90 standard that is still used by many compilers

First of all, on C99, C11, the expression -(-2147483648) == -2147483648 is in fact false:

int is_it_true = (-(-2147483648) == -2147483648); printf("%d\n", is_it_true); 

prints

0 

So how it is possible that this evaluates to true? The machine is using 32-bit two's complement integers. The 2147483648 is an integer constant that quite doesn't fit in 32 bits, thus it will be either long int or long long int depending on whichever is the first where it fits. This negated will result in -2147483648 - and again, even though the number -2147483648 can fit in a 32-bit integer, the expression -2147483648 consists of a >32-bit positive integer preceded with unary -!

You can try the following program:

#include <stdio.h>  int main() {     printf("%zu\n", sizeof(2147483647));     printf("%zu\n", sizeof(2147483648));     printf("%zu\n", sizeof(-2147483648)); } 

The output on such machine most probably would be 4, 8 and 8.

Now, -2147483648 negated will again result in +214783648, which is still of type long int or long long int, and everything is fine.

In C99, C11, the integer constant expression -(-2147483648) is well-defined on all conforming implementations.


Now, when this value is assigned to a variable of type int, with 32 bits and two's complement representation, the value is not representable in it - the values on 32-bit 2's complement would range from -2147483648 to 2147483647.

The C11 standard 6.3.1.3p3 says the following of integer conversions:

  • [When] the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

That is, the C standard doesn't actually define what the value in this case would be, or doesn't preclude the possibility that the execution of the program stops due to a signal being raised, but leaves it to the implementations (i.e. compilers) to decide how to handle it (C11 3.4.1):

implementation-defined behavior

unspecified behavior where each implementation documents how the choice is made

and (3.19.1):

implementation-defined value

unspecified value where each implementation documents how the choice is made


In your case, the implementation-defined behaviour is that the value is the 32 lowest-order bits [*]. Due to the 2's complement, the (long) long int value 0x80000000 has the bit 31 set and all other bits cleared. In 32-bit two's complement integers the bit 31 is the sign bit - meaning that the number is negative; all value bits zeroed means that the value is the minimum representable number, i.e. INT_MIN.


[*] GCC documents its implementation-defined behaviour in this case as follows:

The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.


Negating an (unsuffixed) integer constant:

The expression -(-2147483648) is perfectly defined in C, however it may be not obvious why it is this way.

When you write -2147483648, it is formed as unary minus operator applied to integer constant. If 2147483648 can't be expressed as int, then it s is represented as long or long long* (whichever fits first), where the latter type is guaranteed by the C Standard to cover that value.

To confirm that, you could examine it by:

printf("%zu\n", sizeof(-2147483648)); 

which yields 8 on my machine.

The next step is to apply second - operator, in which case the final value is 2147483648L (assuming that it was eventually represented as long). If you try to assign it to int object, as follows:

int n = -(-2147483648); 

then the actual behavior is implementation-defined. Referring to the Standard:

C11 §6.3.1.3/3 Signed and unsigned integers

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

The most common way is to simply cut-off the higher bits. For instance, GCC documents it as:

For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.

Conceptually, the conversion to type of width 32 can be illustrated by bitwise AND operation:

value & (2^32 - 1) // preserve 32 least significant bits 

In accordance with two's complement arithmetic, the value of n is formed with all zeros and MSB (sign) bit set, which represents value of -2^31, that is -2147483648.

Negating an int object:

If you try to negate int object, that holds value of -2147483648, then assuming two's complement machine, the program will exhibit undefined behavior:

n = -n; // UB if n == INT_MIN and INT_MAX == 2147483647 

C11 §6.5/5 Expressions

If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.

Additional references:

  • INT32-C. Ensure that operations on signed integers do not result in overflow

*) In withdrawed C90 Standard, there was no long long type and the rules were different. Specifically, sequence for unsuffixed decimal was int, long int, unsigned long int (C90 §6.1.3.2 Integer constants).

†) This is due to LLONG_MAX, which must be at least +9223372036854775807 (C11 §5.2.4.2.1/1).

like image 159
Grzegorz Szpetkowski Avatar answered Oct 10 '22 06:10

Grzegorz Szpetkowski