I have a byte-sized number in register AL (8 bits), in 32-bit x86.
I need to exchange between bit 1 (the second from right) and bit 4 (the fifth from right) of the number in register AL.
For example
00010000B input
00000010B output
One instruction answer (needs 256-byte LUT).
xlatb // needs ds:(e)bx to point to the LUT
One to two instructions answer.
(If you need to zero-extend al
, prefer movzx
to a different register if you can trash a temp reg.)
movzx eax, al // mask off extra bits -- perhaps not needed,
mov al, look_up_table[eax] // if upper bits are guaranteed to be zero
Three instructions using al
alone.
test al, 0x12
jpe skip ;; parity was even (aka the bits are the same)
xor al, 0x12 ;; toggle both bits
skip:
Theory of operation: bits need to be exchanged only when they differ. 0 changes to 1 and 1 changes to 0 by xoring them with 1. Both bits are affected at the same time.
The jump could be avoided using a cmovpo
or cmovpe
conditional move if you can assume a P6-compatible CPU like all modern x86. But in this case the sequence requires 4 instructions (or maybe only 3 if some register is known to contain a zero or the bit mask). cmov
isn't available in 8-bit operand-size, so we use movzx
to avoid partial-register problems and enable mov-elimination on Intel CPUs.
movzx edx, al
xor edx, 0x12 ; toggle the copy
test al, 0x12 ; test the original
cmovpo eax, edx ; use the flipped version if exactly 1 of 2 bits were set
PF is only set according to the low 8 bits of the result, so unless you want to horizontally XOR a wider integer down to 8 bits starting with something like mov edx, eax
/ shr edx, 16
/ xor edx, eax
, you can't use PF this way.
Alternatively one could opt for testing if a & mask
has only one bit set. That is done with expression (a== (a&-a))
(like BMI1 blsi
), or with (a & (a-1)) == 0
(like BMI1 blsr
). But both of those are true for a==0
as well as for a
having exactly 1 bit set.
If you actually have BMI1 blsr
, it sets CF if the input was zero, letting you distinguish zero bits set from 1 bit set.
pext rcx, rbx, rax ; rax = input, rbx = mask with 2 bits
xor rbx, rax ; toggled version (destroys mask)
test ecx, ecx ; compute parity flag from the 8 bottom bits
cmovpo rax, rbx ; use toggled version if only 1 bit was set
Here pext
will extract or compress the two bits pointed by the mask rbx
from the source eax
to the (two) least significant bits in the destination register rcx
. Since now the bits to be tested are in the proper area, they can be tested to set/clear PF.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With