Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can someone explain ARM bitwise operations to me?

Can someone explain ARM bit-shifts to me like I'm five? I have a very poor understanding of anything that involves non-decimal number systems so understanding the concepts of bit shifts and bitwise operators is difficult for me.

What would each of the following cases do and why (what would end up in R3 and what happens on behind the scenes on the bit level)?

/** LSL **/
mov r0, #1
mov r3, r0, LSL#10

/** LSR **/
mov r0, #1
mov r3, r0, LSR#10

/** ORR **/
mov r0, #1
mov r1, #4
orr r3, r1, r0

/** AND **/
mov r0, #1
mov r1, #4
and r3, r1, r0

/** BIC **/
mov r0, #1
mov r1, #4
bic r3, r1, r0

PS. Do not explain it in terms of C bitwise operators. I don't know what they do either (the >>, <<, |, & ones).

like image 458
Kristina Brooks Avatar asked Jan 28 '12 11:01

Kristina Brooks


People also ask

What is the use of bitwise operators in real life?

Examples of uses of bitwise operations include encryption, compression, graphics, communications over ports/sockets, embedded systems programming and finite state machines. A bitwise operator works with the binary representation of a number rather than that number's value.

What is a Bitwise operator explain in detail?

Bitwise operators are characters that represent actions (bitwise operations) to be performed on single bits. They operate at the binary level and perform operations on bit patterns that involve the manipulation of individual bits.

Is Bitwise still used?

Bitwise operations are worth studying because they have many applications. It is not their main use to substitute arithmetic operations. Cryptography, computer graphics, hash functions, compression algorithms, and network protocols are just some examples where bitwise operations are extremely useful.


2 Answers

Truth tables, two inputs, the two numbers on the left and one output, the number on the right:

OR

a b  c     
0 0  0
0 1  1
1 0  1
1 1  1

the left two inputs a and b represent the four possible combinations of inputs, no more no less that is the list.

Consider a 1 to mean true and 0 to mean false. And the word OR in this case means if a OR b is true then c is true. And as you see in the table, horizontally if either a or b is true then c is true.

AND

a b  c
0 0  0
0 1  0
1 0  0
1 1  1

And means they both have to be true if a AND b are both true then c is true. There is only one case where that exists above.

Now take two bytes 0x12 and 0x34 which in decimal are 18 and 52 but we dont really care much about decimal. we care about binary 0x12 is 0b00010010 and 0x34 is 0b00110100. The bitwise operators like AND and OR and XOR in assembly language mean you take one bit from each operand and that gives the result in the same bit location. Its not like add where you have things like this plus that equals blah carry the one.

so we line up the bits

0b00010010    0x12
0b00110100    0x34

So tilt your head sidways like you are going to take a bite out of a taco held in your left hand and visualize the truth table above. If we look at the two bits on the right they are 0 and 0, the next two bits are 1 and 0 and so on. So if we wanted to do an OR operation, the rule is if either a or b is true then c, the result, is true

   0b00010010
   0b00110100
OR ==========
   0b00110110

Head tilted to the right, least significant bit (the bit in the ones column in the number) 0 or 0 = 0, neither one is set. next column (the twos column) 1 or 0 = 1 at least one is true. and so on so

0x12 OR 0x34 = 0x36

In arm assembly that would be

mov r0,#0x12
mov r1,#0x34
orr r2,r0,r1

after the or operation r2 would hold the value 0x36.

Now lets and those numbers

    0b00010010
    0b00110100
AND ==========
    0b00010000

Remembering our truth table and the rule both a and b have to be true (a 1) we tilt our head to the right, 0 and 0 is 0, both are not true. and by inspection only one column has both inputs with a 1, the 16s column. this leaves us with 0x12 AND 0x34 = 0x10

In arm assembly that would be

mov r0,#0x12
mov r1,#0x34
and r2,r0,r1

Now we get to the BIC instruction. Which stands for bitwise clear, which hopefully will make sense in a bit. Bic on the arm is a anded with not b. Not is another truth table, but only one input and one output

NOT

a  c
0  1
1  0

With only one input we have only two choices 0 and 1, 1 is true 0 is false. NOT means if not a then c is true. when a is not true c is true, when a is true c is not true. Basically it inverts.

What the bic does is have two inputs a and b, the operation is c = a AND (NOT b) so the truth table for that would be:

a AND (NOT b)

a b  c
0 1  0
0 0  0
1 1  0
1 0  1

I started with the AND truth table then then NOTted the b bits, where b was a 0 in the AND truth table I made it a 1 where b was a 1 in the AND truth table I made it a 0.

So the bic operation on 0x12 and 0x34 is

    0b00010010
    0b00110100
BIC ==========
    0b00000010

Why is it called bit clear? Understanding that makes it much easier to use. If you look at the truth table and think of the first and second inputs. Where the second, b, input is a 1 the output is 0. where the second input, b, is a 0, the output is a itself unmodified. So what that truth table or operation is doing is saying anywhere b is set clear or zero those bits in A. So if I have the number 0x1234 and I want to zero the lower 8 bits, I would BIC that with 0x00FF. And your next question is why not AND that with 0xFF00? (analyze the AND truth table and see that wherever b is a 1 you keep the a value as is, and wherever b is a 0 you zero the output). The ARM uses 32 bit registers, and a fixed 32 bit instruction set, at least traditionally. The immediate instructions

mov r0,#0x12

In arm are limited to 8 non-zero bits shifted anywhere within the number, will get to shifting in a bit. So if I had the value 0x12345678 and wanted to zero out the lower 8 bits I could do this

; assume r0 already has 0x12345678
bic r0,r0,#0xFF

or

; assume r0 already has 0x12345678
mov r1,#0xFF000000
orr r1,r1,#0x00FF0000
orr r1,r1,#0x0000FF00
;r1 now contains the value 0xFFFFFF00
and r0,r0,r1

or

; assume r0 already contains 0x12345678
ldr r1,my_byte_mask
and r0,r0,r1
my_byte_mask: .word 0xFFFFFF00

which is not horrible, compared to using a move and two orrs, but still burns more clock cycles than the bic solution because you burn the extra memory cycle reading my_byte_mask from ram, which can take a while.

or

; assume r0 already contains 0x12345678
mvn r1,#0xFF
and r0,r0,r1

This last one being not a bad compromize. note that mvn in the arm documentation is bitwise not immediate, that means rx = NOT(immediate). The immediate here is 0xFF. NOT(0xFF) means invert all the bits, it is a 32 bit register we are going to so that means 0xFFFFFF00 is the result of NOT(0xFF) and that is what the register r1 gets, before doing the and.

So that is why bic has a place in the ARM instruction set, because sometimes it takes fewer instructions or clock cycles to mask (mask = AND used to make some bits zeros) using the bic instruction instead of the and instruction.

I used the word mask as a concept to make bits in a number zero leaving the others alone. orring can be thought of as making bits in a number one while leaving the others alone, if you look at the OR truth table any time b is a 1 then c is a 1. So 0x12345678 OR 0x000000FF results in 0x123456FF the bits in the second operand are set. Yes it is also true that anytime a is set in the OR truth table then the output is set, but a lot of the time when you use these bitwise operations you have one operand you want to do something to, set a certain number of bits to one without modifying the rest or set a certain number of bits to zero without modifying the rest or you want to zero all the bits except for a certain number of bits. When used that way you have one operand coming in which is what you want to operate on and you create the second operand based on what you want the overall effect to be, for example in C if we wanted to keep only the lower byte we could have a one parameter in, one parameter out function:

unsigned int keep_lower_byte ( unsigned int a )
{
    return(a&(~0xFF));
}

~ means NOT so ~0xFF, for 32 bit numbers means 0xFFFFFF00 then & means AND, so we return a & 0xFFFFFF00. a was the only real operand coming in and we invented the second one based on the operation we wanted to do...Most bitwise operations you can swap the operands in the instruction and everything turns out okay, instructions like ARM's bic though the operands are in a certain order, just like a subtract you have to use the correct order of operands.

Shifting...there are two kinds, logical, and arithmetic. logical is easiest and is what you get when you use >> or << in C.

Start with 0x12 which is 0b00010010. Shifting that three locations to the left (0x12<<3) means

00010010 < our original number 0x12
0010010x < shift left one bit location
010010xx < shift left another bit location
10010xxx < shift left a third bit location

What bits get "shifted in" to the empty locations, the x'es above, varies based on the operation. For C programming it is always zeros:

00010010 < our original number 0x12
00100100 < shift left one bit location
01001000 < shift left another bit location
10010000 < shift left a third bit location

But sometimes (usually every instruction set supports a rotate as well as a shift) there are other ways to shift and the differences have to do with what bit you shift into the empty spot, and also sometimes the bit you shifted off the end doesnt always just disappear sometimes you save that in a special bit holder location.

Some instruction sets only have a single bit shift meaning for each instruction you program you can only shift one bit, so the above would be 3 instructions, one bit at a time. Other instruction sets, like arm, allow you to have a single instruction and you specify in the instruction how many bits you want to shift in that direction. so a shift left of three

mov r0,#0x12
mov r3,r0,lsl#3 ; shift the contents of r0 3 bits to the left and store in r3

This varying of what you shift in is demonstrated between lsr and asr, logical shift right and arithmetic shift right (you will see that there is no asl, arithmetic shift left because that makes no sense, some assemblers will allow you to use an asl instruction but encode it as a lsl).

A LOGICAL shift right:

00010010 - our original number 0x12
x0001001 - shifted right one bit
xx000100 - shifted right another bit
xxx00010 - shifted right another bit

As with C there is a version that shifts in zeros, that is the logical shift right, shifting in zeros

00010010 - our original number 0x12
00001001 - shifted right one bit
00000100 - shifted right another bit
00000010 - shifted right another bit

ARITHMETIC shift right means preserve the "sign bit" what is the sign bit? that gets into twos complement numbers which you also need to learn if you have not. Basically if you consider the bit pattern/value to be a twos complement number then the most significant bit, the one on the left, is the sign bit. if it is 0 the number is positive and 1 the number is negative. You may have noticed that a shift left by one bit is the same as multiplying by 2 and a shift right is the same as dividing by 2. 0x12 >> 1 = 0x9, 18 >> 1 = 9 but what if we were to shift a minus 2 to the right one, minus two is 0xFE using bytes or 0b11111110. using the C style logical shift right 0xFE >> 1 = 0x7F, or in decimal -2 >> 1 = 0x127. We cannot solve that in C in a single operation, unfortunately, but in assembly we can using an arithmetic shift, assuming your instruction set has one, which the arm does

ARITHMETIC shift right

s1100100 - our starting value s is the sign bit whatever that is 0 or 1
ss110010 - one shift right
sss11001 - another shift right
ssss1100 - another shift right

So if the sign bit s was a 0 when we started, if the number was 01100100 then

01100100 - our starting value
00110010 - one shift right
00011001 - another shift right
00001100 - another shift right

but if that sign bit had been a one

11100100 - our starting value
11110010 - one shift right
11111001 - another shift right
11111100 - another shift right

And we can solve the 0xFE shifted right one:

11111110 - 0xFE a minus 2 in twos complement for a byte
11111111 - shifted right one

so in pseudo code 0xFE ASR 1 = 0xFF, -2 ASR 1 = -1. -2 divided by 2 = -1

The last thing you need to read up on your own has to do with rotates and/or what happens to the bit that is shifted off the end. a shift right the lsbit is shifted "off the end" of the number like blocks being slid of a table and the one that falls off might just go into the "bit bucket" (ether, heaven or hell, one of these places where bits go to die when they disappear from this world). But some instructions in some instruction sets will take that bit being shifted off and put it in the Carry flag (read up on add and subtract), not because it is a carry necessarily but because there are status bits in the alu and the Carry bit is one that kinda makes sense. Now what a rotate is, is lets say you had an 8 bit processor and you rotated one bit, the bit falling off the end lands in the Carry bit, AND the bit shifting in the other side is what was in the carry bit before the operation. Basically it is musical chairs, the bits are walking around the chairs with one person left standing, the person standing is the carry bit, the people in chairs are the bits in the register. Why is this useful at all? lets say we had an 8 bit processor like the Atmel AVR for example but wanted to do a 64 bit shift. 64 bits takes 8, 8 bit, registers, say I have my 64 bit number in those 8 registers and I want to do a 64 bit shift left one bit. I would start with the least significant byte and do an lsl which shifts a zero in but the bit shifting out goes into the carry bit. then the next most significant byte I do a rol, rotate left one bit, the bit coming in is the bit going out of the prior byte and the bit going out goes to the carry bit. I repeat the rol instruction for the other bytes, looking at a 16 bit shift:

00100010 z0001000 - our original number
00100010 z 0001000 - lsl the least significant byte, the ms bit z is in carry
0100010z 00010000 - rotate left the most significant byte pulling the z bit from carry

00100010z0001000 - if it had been a 16 bit register
0100010z00010000 - a logical shift left on a 16 bit with a zero coming in on the left

that is what the rotates are for and that is why the assembly manual bothers to tell you what flags are modified when you perform a logical operation.

like image 66
old_timer Avatar answered Oct 07 '22 06:10

old_timer


I'll do the first one and then maybe you can try and work out the rest using a similar approach:

/** LSL **/
mov r0, #1            ; r0 = 0000 0000 0000 0000 0000 0000 0000 0001
mov r3, r0, LSL#10    ; r3 = r0 logically shifted left by 10 bit positions 
                           = 0000 0000 0000 0000 0000 0100 0000 0000
                                                       ^           ^
                                                       +<<<<<<<<<<<+
                                                     shift left 10 bits

Note however that if you don't yet understand boolean operations such as OR (|), AND (&), etc, then you will have a hard time understanding the corresponding ARM instructions (ORR, AND, etc).

like image 27
Paul R Avatar answered Oct 07 '22 08:10

Paul R