Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does Java's bit shift operator work under the hood?

I didn't study IT, and only very recently came across bit shifts and an application for two's complement. So, can you please use simple English in your explanations and assume I know hardly anything about IP addresses, bit operations, and Java datatypes?

Today, I found the following piece of code (abbreviated):

long m = (-1) << (byte) 16;

Now, this is for IP subnet masking. I know I need to start out with 4 blocks of 8 bits (i.e. 4 bytes), and all bits have to be "switched on": 11111111 11111111 1111111 1111111 Next, zeros are shifted in from the right, in this case 16 bits' worth; so we get 11111111 11111111 00000000 0000000, the mask.

But I do have a few questions:

  1. Does the 16 have to be of type byte for this to work?
  2. The result is of type long. When the expression above runs, -1 gets converted into - effectively - 4x8bit blocks. How does Java know it needs 32 positions/bits (an IP address' length), and not, say, 16, or 8, when applying two's complement? (I'm guessing that has to do with the long datatype?)
  3. Why is two's complement applied to -1 to begin with? (Google gives you -0b1 if you ask it what -1 is in binary. I first thought it might be to do with overflow, but it isn't, is it...?)
  4. Really, what datatypes does the compiler convert this to while it's running the code, to make it all work?

UPDATE: The 16 is produced at runtime by a method; I just put a constant in here as an example. In hindsight probably a bad idea...

like image 593
Christian Avatar asked Sep 08 '15 15:09

Christian


People also ask

How does bitwise Shift Work?

The bitwise shift operators move the bit values of a binary object. The left operand specifies the value to be shifted. The right operand specifies the number of positions that the bits in the value are to be shifted. The result is not an lvalue.

How does a right shift operator work?

The right shift operator ( >> ) returns the signed number represented by the result of performing a sign-extending shift of the binary representation of the first operand (evaluated as a two's complement bit string) to the right by the number of bits, modulo 32, specified in the second operand.

How does left shift operator work?

The left shift operator ( << ) shifts the first operand the specified number of bits, modulo 32, to the left. Excess bits shifted off to the left are discarded. Zero bits are shifted in from the right.

What does bit shifting by 1 do?

Bitshifting shifts the binary representation of each pixel to the left or to the right by a pre-defined number of positions. Shifting a binary number by one bit is equivalent to multiplying (when shifting to the left) or dividing (when shifting to the right) the number by 2.


2 Answers

Really, what datatypes does the compiler convert this to while it's running the code, to make it all work?

This

(-1) << (byte) 16;

is a constant expression. Its value is known at compile time. It's a long with the value -65536 (in decimal representation).

If the expression wasn't a constant expression, the type of the variable wouldn't matter when evaluating the expression. It would only matter later when its value is assigned to the variable.

Take for example

int i = -1;
long m = i << (byte) 16;

The expression above is one that involves a shift operator and two operands, one of type int and another of type byte.

The JLS states the following concerning shift operators and their operands

Unary numeric promotion (§5.6.1) is performed on each operand separately.

which is

Otherwise, if the operand is of compile-time type byte, short, or char, it is promoted to a value of type int by a widening primitive conversion (§5.1.2).

So the byte value is widened to an int. So no to your first question.

The result of the expression would be a value of type int (32 bits). It has to be assigned to a long (64 bits) variable, so the value would be widened to a long before being assigned.

From the JLS again

The integral types are byte, short, int, and long, whose values are 8-bit, 16-bit, 32-bit and 64-bit signed two's-complement integers, respectively, and char, whose values are 16-bit unsigned integers representing UTF-16 code units (§3.1).

That's how they are stored.

like image 174
Sotirios Delimanolis Avatar answered Oct 28 '22 18:10

Sotirios Delimanolis


It is actually confusing that your m variable is of long type because an IP address is 32-bit and corresponds to an int. Your right-hand side is indeed int and only after it's fully computed is it extended to long (64-bit). Answering your questions:

  1. It doesn't. You can remove the cast.
  2. The result is actually of type int, but gets converted to long because the type of m requires it.
  3. Two's complement is not "applied" to anything, really. The number -1 is encoded in two's complement. You need some way to represent negative numbers with nothing but bits. Plus, here two's complement plays a side role: it is just about -1 being encoded as all 1-bits.
  4. It's all just a block of 32 one-bits being shifted to the left, zeroes filling in the vacancy. Then, to convert to long, 32 more 1-bits are added on the left side.
like image 31
Marko Topolnik Avatar answered Oct 28 '22 18:10

Marko Topolnik