Can someone explain the following code output to me:
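#include <stdio.h>

/* Print the argument as an unsigned long in hexadecimal. */
void myprint(unsigned long a)
{
    printf("Input is %lx\n", a);
}

int main(void)
{
    myprint(1 << 31);      /* shift performed in type int */
    myprint(0x80000000);   /* hexadecimal integer constant */
    return 0;
}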
Output with gcc main.c:
Input is ffffffff80000000
Input is 80000000
Why is (1 << 31) treated as signed and 0x80000000 treated as unsigned?
Integer variables can be represented in two ways: signed and unsigned. A signed type uses a sign bit to distinguish negative values from non-negative ones, whereas an unsigned type stores only non-negative values.
"Signed" means the item can hold positive or negative values; "unsigned" means it holds only zero and positive values. In C the distinction applies to the integer types (char, short, int, long, long long); floating-point types are always signed.
Recall: to widen a two's complement integer, copy its leftmost bit (the sign bit) into every added bit position on the left until you have the desired number of bits. This is called sign extension.
In C the result of an expression depends on the types of the operands (or some of the operands). In particular, 1 is an int (signed), therefore 1 << n is also an int.
The type (including signed-ness) of 0x80000000 is determined by the rules for integer constants (C11 6.4.4.1), and it depends on the size of int and the other integer types on your system, which you haven't specified. A type is chosen such that 0x80000000 (a large positive number) is in range for that type.
In case you have any misconception: the literal 0x80000000 is a large positive number. People sometimes mistakenly equate it to a negative number, mixing up values with representations.
In your question you ask why 0x80000000 is treated as unsigned. However, your code does not actually rely on the signed-ness of 0x80000000. The only thing you do with it is pass it to a function that takes an unsigned long parameter. So whether it is signed or unsigned doesn't matter; when passed to the function it is converted to an unsigned long with the same value. (Since 0x80000000 is within the minimum guaranteed range for unsigned long, there is no chance of it being out of range.)
So, that's 0x80000000 dealt with. What about 1 << 31? If your system has 32-bit int (or narrower), this causes undefined behaviour due to signed arithmetic overflow. If your system has larger ints, then this will produce the same output as the 0x80000000 line.
If you use 1u << 31 instead, and you have 32-bit ints, then there is no undefined behaviour and you are guaranteed to see the program output 80000000 twice.
Since your output was not 80000000 twice, we can conclude that your system has 32-bit (or narrower) int, and that your program actually causes undefined behaviour. The type of 0x80000000 would be unsigned int if int is 32-bit; with a narrower int it would be long or unsigned long, whichever comes first in the standard's list and can represent the value.
Why is (1 << 31) treated as signed and 0x80000000 treated as unsigned?
From 6.5.7 Bitwise shift operators in the C11 specification:
3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. [...]
4 The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
So, because 1 is an int (from section 6.4.4.1, quoted in the following paragraph), 1 << 31 is also an int, and its value is undefined on systems where int is 32 bits or narrower. (It may even trap.)
From 6.4.4.1 Integer constants
3 A decimal constant begins with a nonzero digit and consists of a sequence of decimal digits. An octal constant consists of the prefix 0 optionally followed by a sequence of the digits 0 through 7 only. A hexadecimal constant consists of the prefix 0x or 0X followed by a sequence of the decimal digits and the letters a (or A) through f (or F) with values 10 through 15 respectively.
and
5 The type of an integer constant is the first of the corresponding list in which its value can be represented.
Suffix   | Decimal Constant | Hex Constant
---------+------------------+-----------------------
none     | int              | int
         | long int         | unsigned int
         | long long int    | long int
         |                  | unsigned long int
         |                  | long long int
         |                  | unsigned long long int
---------+------------------+-----------------------
u or U   | unsigned int     | unsigned int
         | [...]            | [...]
So, on a system where int is 32 bits or narrower and unsigned int is 32 bits or wider, 0x80000000 is an unsigned int.