
Why use only the lower five bits of the shift operand when shifting a 32-bit value? (e.g. (UInt32)1 << 33 == 2)

Tags: c#, bit-shift

Consider the following code:

UInt32 val = 1;
UInt32 shift31 = val << 31;                    // shift31  == 0x80000000
UInt32 shift32 = val << 32;                    // shift32  == 0x00000001
UInt32 shift33 = val << 33;                    // shift33  == 0x00000002
UInt32 shift33a = (UInt32)((UInt64)val << 33); // shift33a == 0x00000000

It doesn't generate a warning (about using a shift count of 32 or more), so it must be expected behavior.

The code that actually ends up in the compiled assembly (or at least Reflector's interpretation of it) is

 uint val = 1;
 uint shift31 = val << 0x1f;
 uint shift32 = val;
 uint shift33 = val << 1;
 uint shift33a = val << 0x21;  

The IL (again, using Reflector) is

L_0000: nop 
L_0001: ldc.i4.1 
L_0002: stloc.0 
L_0003: ldloc.0 
L_0004: ldc.i4.s 0x1f
L_0006: shl 
L_0007: stloc.1 
L_0008: ldloc.0 
L_0009: stloc.2 
L_000a: ldloc.0 
L_000b: ldc.i4.1 
L_000c: shl 
L_000d: stloc.3 
L_000e: ldloc.0 
L_000f: conv.u8 
L_0010: ldc.i4.s 0x21
L_0012: shl 
L_0013: conv.u4 
L_0014: stloc.s shift33a

I understand what is going on (it's described in MSDN); when the code is compiled, only the lower 5 bits are being used when shifting a 32-bit value... I'm curious as to why this happens.

(The way shift33a comes out also makes me think that something isn't quite right with Reflector, as its C# presentation of the IL would compile to something different.)

The question(s):

  • Why are only the lower 5 bits of "the value to shift by" used?
  • If "it doesn't make sense to shift more than 31 bits", why isn't there a warning?
  • Is this a backwards compatibility thing (i.e. is this what programmers "expect" to happen)?
  • Am I correct that the underlying IL can do shifts of more than 31 bits (as in L_0010: ldc.i4.s 0x21) but the compiler is trimming the values?
Daniel LeCheminant asked Mar 13 '09 21:03



1 Answer

It basically boils down to how x86 handles its shift opcodes: only the bottom 5 bits of the shift count are used (see the 80386 programming guide, for example). In C/C++, shifting a 32-bit integer by more than 31 bits is technically undefined behavior, in keeping with the C philosophy of "you don't pay for what you don't need". From section 6.5.7, paragraph 3 of the C99 standard:

The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.

This allows compilers to emit a single shift instruction on x86 for 32-bit shifts. 64-bit shifts cannot be done in one instruction on 32-bit x86; they use the SHLD/SHRD instructions plus some additional logic. On x86_64, 64-bit shifts can be done in one instruction.

For example, gcc 3.4.4 emits the following assembly for a 64-bit left-shift by an arbitrary amount (compiled with -O3 -fomit-frame-pointer):

#include <stdint.h>

uint64_t lshift(uint64_t x, int r)
{
  return x << r;
}

_lshift:
    movl    12(%esp), %ecx
    movl    4(%esp), %eax
    movl    8(%esp), %edx
    shldl   %cl,%eax, %edx
    sall    %cl, %eax
    testb   $32, %cl
    je      L5
    movl    %eax, %edx
    xorl    %eax, %eax
L5:
    ret
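
As a rough reading of the assembly above, the same logic can be written in C over two 32-bit halves. This is only a sketch: the function name `lshift64_via_halves` is made up for illustration, and the count is masked explicitly so that the C code itself never performs an out-of-range shift.

```c
#include <stdint.h>

/* Hypothetical helper: left-shift a 64-bit value using only 32-bit
 * operations, mirroring the shldl / sall / testb sequence above. */
uint64_t lshift64_via_halves(uint64_t x, unsigned r)
{
    uint32_t lo = (uint32_t)x;          /* plays the role of %eax */
    uint32_t hi = (uint32_t)(x >> 32);  /* plays the role of %edx */
    unsigned c = r & 31;                /* the hardware uses only the low 5 bits of %cl */

    /* shldl: shift hi left, filling from the top bits of lo.
     * c == 0 is skipped because lo >> 32 would be undefined in C. */
    if (c != 0) {
        hi = (hi << c) | (lo >> (32 - c));
        lo <<= c;                       /* sall */
    }

    /* testb $32, %cl: if bit 5 of the count is set, the low half moves
     * into the high half and the low half becomes zero. */
    if (r & 32) {
        hi = lo;
        lo = 0;
    }
    return ((uint64_t)hi << 32) | lo;
}
```

For counts from 0 to 63 this should agree with a native `x << r` on a 64-bit value; for example, `lshift64_via_halves(1, 33)` yields 0x200000000.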

Now, I'm not very familiar with C#, but I'm guessing it has a similar philosophy: design the language so that it can be implemented as efficiently as possible. By specifying that shift operations use only the bottom 5 bits of the shift count (6 bits for 64-bit operands), it permits the JIT compiler to compile shifts as optimally as possible. 32-bit shifts, as well as 64-bit shifts on 64-bit systems, can be JIT-compiled into a single opcode.
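
To make the masking rule concrete, here is a small C sketch that emulates it: 5 bits of the count for a 32-bit operand, 6 bits for a 64-bit operand. The function names `shl32_masked` and `shl64_masked` are invented for this illustration. The explicit `&` is needed because a plain C shift by the operand width or more is undefined.

```c
#include <stdint.h>

/* Emulates a 32-bit left shift that uses only the low 5 bits of the count. */
uint32_t shl32_masked(uint32_t value, int count)
{
    return value << ((unsigned)count & 0x1Fu);
}

/* Emulates a 64-bit left shift that uses only the low 6 bits of the count. */
uint64_t shl64_masked(uint64_t value, int count)
{
    return value << ((unsigned)count & 0x3Fu);
}
```

With these helpers, `shl32_masked(1, 32)` gives 1 and `shl32_masked(1, 33)` gives 2, matching `shift32` and `shift33` in the question. `(uint32_t)shl64_masked(1, 33)` gives 0, matching `shift33a`.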

If C# were ported to a platform whose native shift opcodes behaved differently, this rule would actually incur an extra performance hit: the JIT compiler would have to add extra logic to ensure that only the bottom 5 or 6 bits of the shift count were used, as the standard requires.

Adam Rosenfield answered Oct 17 '22 08:10