I think the question is self explanatory, I guess it probably has something to do with overflow but still I do not quite get it. What is happening, bitwise, under the hood?
Why does -(-2147483648) = -2147483648
(at least while compiling in C)?
Integers.info - Binary numbers: 2147483648 = 10000000000000000000000000000000.
A 32-bit signed integer. It has a minimum value of -2,147,483,648 and a maximum value of 2,147,483,647 (inclusive).
Note: this answer does not apply as such on the obsolete ISO C90 standard that is still used by many compilers
First of all, on C99, C11, the expression -(-2147483648) == -2147483648
is in fact false:
int is_it_true = (-(-2147483648) == -2147483648); printf("%d\n", is_it_true);
prints
0
So how it is possible that this evaluates to true? The machine is using 32-bit two's complement integers. The 2147483648
is an integer constant that quite doesn't fit in 32 bits, thus it will be either long int
or long long int
depending on whichever is the first where it fits. This negated will result in -2147483648
- and again, even though the number -2147483648
can fit in a 32-bit integer, the expression -2147483648
consists of a >32-bit positive integer preceded with unary -
!
You can try the following program:
#include <stdio.h> int main() { printf("%zu\n", sizeof(2147483647)); printf("%zu\n", sizeof(2147483648)); printf("%zu\n", sizeof(-2147483648)); }
The output on such machine most probably would be 4, 8 and 8.
Now, -2147483648
negated will again result in +214783648
, which is still of type long int
or long long int
, and everything is fine.
In C99, C11, the integer constant expression -(-2147483648)
is well-defined on all conforming implementations.
Now, when this value is assigned to a variable of type int
, with 32 bits and two's complement representation, the value is not representable in it - the values on 32-bit 2's complement would range from -2147483648 to 2147483647.
The C11 standard 6.3.1.3p3 says the following of integer conversions:
- [When] the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
That is, the C standard doesn't actually define what the value in this case would be, or doesn't preclude the possibility that the execution of the program stops due to a signal being raised, but leaves it to the implementations (i.e. compilers) to decide how to handle it (C11 3.4.1):
implementation-defined behavior
unspecified behavior where each implementation documents how the choice is made
and (3.19.1):
implementation-defined value
unspecified value where each implementation documents how the choice is made
In your case, the implementation-defined behaviour is that the value is the 32 lowest-order bits [*]. Due to the 2's complement, the (long) long int value 0x80000000
has the bit 31 set and all other bits cleared. In 32-bit two's complement integers the bit 31 is the sign bit - meaning that the number is negative; all value bits zeroed means that the value is the minimum representable number, i.e. INT_MIN
.
[*] GCC documents its implementation-defined behaviour in this case as follows:
The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).
For conversion to a type of width
N
, the value is reduced modulo2^N
to be within range of the type; no signal is raised.
The expression -(-2147483648)
is perfectly defined in C, however it may be not obvious why it is this way.
When you write -2147483648
, it is formed as unary minus operator applied to integer constant. If 2147483648
can't be expressed as int
, then it s is represented as long
or long long
* (whichever fits first), where the latter type is guaranteed by the C Standard to cover that value†.
To confirm that, you could examine it by:
printf("%zu\n", sizeof(-2147483648));
which yields 8
on my machine.
The next step is to apply second -
operator, in which case the final value is 2147483648L
(assuming that it was eventually represented as long
). If you try to assign it to int
object, as follows:
int n = -(-2147483648);
then the actual behavior is implementation-defined. Referring to the Standard:
C11 §6.3.1.3/3 Signed and unsigned integers
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
The most common way is to simply cut-off the higher bits. For instance, GCC documents it as:
For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
Conceptually, the conversion to type of width 32 can be illustrated by bitwise AND operation:
value & (2^32 - 1) // preserve 32 least significant bits
In accordance with two's complement arithmetic, the value of n
is formed with all zeros and MSB (sign) bit set, which represents value of -2^31
, that is -2147483648
.
int
object:If you try to negate int
object, that holds value of -2147483648
, then assuming two's complement machine, the program will exhibit undefined behavior:
n = -n; // UB if n == INT_MIN and INT_MAX == 2147483647
C11 §6.5/5 Expressions
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.
*) In withdrawed C90 Standard, there was no long long
type and the rules were different. Specifically, sequence for unsuffixed decimal was int
, long int
, unsigned long int
(C90 §6.1.3.2 Integer constants).
†) This is due to LLONG_MAX
, which must be at least +9223372036854775807
(C11 §5.2.4.2.1/1).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With