I am currently trying to wrap my head around bitwise and bit shift operators in Java. Although they make sense to me in simplified toy examples (basically positive integers), my understanding falls apart as soon as negatives are involved, and in some other cases. I tried searching all over the Internet with two search engines and I even checked the Java specification. I can't find any source that properly describes how bitwise and bit shift operators work in Java.
One function in the Java standard library that is especially confusing to me is java.lang.Integer.toUnsignedLong(int)
. The source from OpenJdk is shown here (LGPLv2 with classpath exception), with an excerpt in the Javadoc:
/**
* Converts the argument to a {@code long} by an unsigned
* conversion. In an unsigned conversion to a {@code long}, the
* high-order 32 bits of the {@code long} are zero and the
* low-order 32 bits are equal to the bits of the integer
* argument.
*/
public static long toUnsignedLong(int x) {
return ((long) x) & 0xffffffffL;
}
According to the official documentation reproduced above, "the high-order 32 bits of the long are zero and the low-order 32 bits are equal to the bits of the integer argument." I do not see how this follows from the code inside the method body however.
When reading the method, the following is my train of thought for positive x:
0xffffffff
has all ones in the lowest-order 4 bytes, and because only these bytes will have data in them, this mask has no effect and the correct result is returned.When reading it in the context of a negative x
however, my understanding falls apart:
0xffffffff
has all ones in the lowest-order 4 bytes and zero in the highest-order four bytes, it has the only effect of changing the sign bit on the long, and keeps the incorrect integer in the four least significant bits intact. As such, it returns a wrong answer from this method, where the sign bit of the integer is changed as it moves into the long.When I test this method, however, I get results consistent with the Javadoc. I suspect that I am misunderstanding one or more fundamental points about bitwise operators in Java or its two's complement integer representation, and I hope that this question can clarify those points.
Bitwise OR (|) The bitwise OR operator ( | ) returns a 1 in each bit position for which the corresponding bits of either or both operands are 1 s.
Bitwise NOT (~)Each bit in the operand is inverted in the result. The 32-bit signed integer operand is inverted according to two's complement. That is, the presence of the most significant bit is used to express negative integers. Bitwise NOTing any number x yields -(x + 1) .
Bitwise Complement (~) It is a unary operator denoted by the symbol ~ (pronounced as the tilde). It returns the inverse or complement of the bit. It makes every 0 a 1 and every 1 a 0. Let's use the bitwise complement operator in a Java program.
The bitwise operators work exactly as you would expect. They are strict bit-operators and do not consider semantics of bits at all.
Sometimes it is easiest to run through code using breakpoints. For your specific example, I converted the steps of the operation into atomic statements and printed the results with Long.toString
.
int x = -57;
// step 1:
long xCast = (long) x;
System.out.println(Long.toString(xCast, 2)); // -1110011 - this is not the bitwise representation however.
long mask = 0xffffffffL;
System.out.println(Long.toString(mask, 2)); // 11111111111111111111111111111111
// step 2:
long result = ((long) x) & mask;
System.out.println(Long.toString(result, 2)); // 11111111111111111111111111000111
Step 1 is the primary reason the operation looks as it does. In Java, all (strictily numeric) values are signed (chars are unsigned). This means that, as you correctly stated, all highest bits are sign-bits. However the interesting part is what the rest of the bits do, if a number is negative. The following thread already covered the basics of the 'Two's complement': What is “2's Complement”? So does this wikipedia-page: https://en.wikipedia.org/wiki/Two%27s_complement
To make it short, in java, for integers:
int zero = 0; // == 0b00000000_00000000_00000000_00000000
int maxPositive = Integer.MAX_VALUE; // == 0b01111111_11111111_11111111_11111111
int minus1 = -1; // == 0b11111111_11111111_11111111_11111111
int minNegative = Integer.MIN_VALUE; // == 0b10000000_00000000_00000000_00000000
So the reason everything works out is because if the integer is negative, when it is cast, the entire upper 32 bits are converted to 1s, because otherwise the represented value of the number would change. effectively:
int x = 0b11111111_11111111_11111111_11000111;
is cast to:
long xCast = 0b11111111_11111111_11111111_11111111_11111111_11111111_11111111_11000111;
Because you as the developer expect the method to return only the initially set bits, you have to mask the upper bits out of the result. This is done in step 2.
So the answer to your example: The representation of non-floating values in Java are two's complement, and therefore, when smart-casting a value from int to long, the upper bits are filled up with 1s for negative numbers. Thus they have to be removed.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With