I'd like a shortcut for the following little function, where performance is very important (the function is called more than 10.000.000 times):
inline int len(uint32 val)
{
if(val <= 0x000000ff) return 1;
if(val <= 0x0000ffff) return 2;
if(val <= 0x00ffffff) return 3;
return 4;
}
Does anyone have any idea... a cool bitoperation trick? Thanks for your help in advance!
How about this one?
inline int len(uint32 val)
{
return 4
- ((val & 0xff000000) == 0)
- ((val & 0xffff0000) == 0)
- ((val & 0xffffff00) == 0)
;
}
Removing the inline
keyword, g++ -O2
compiles this to the following branchless code:
movl 8(%ebp), %edx
movl %edx, %eax
andl $-16777216, %eax
cmpl $1, %eax
sbbl %eax, %eax
addl $4, %eax
xorl %ecx, %ecx
testl $-65536, %edx
sete %cl
subl %ecx, %eax
andl $-256, %edx
sete %dl
movzbl %dl, %edx
subl %edx, %eax
If you don't mind machine-specific solutions, you can use the bsr
instruction which searches for the first 1 bit. Then you simply divide by 8 to convert bits to bytes and add 1 to shift the range 0..3 to 1..4:
int len(uint32 val)
{
asm("mov 8(%ebp), %eax");
asm("or $255, %eax");
asm("bsr %eax, %eax");
asm("shr $3, %eax");
asm("inc %eax");
asm("mov %eax, 8(%ebp)");
return val;
}
Note that I am not an inline assembly god, so maybe there's a better to solution to access val
instead of addressing the stack explicitly. But you should get the basic idea.
The GNU compiler also has an interesting built-in function called __builtin_clz
:
inline int len(uint32 val)
{
return ((__builtin_clz(val | 255) ^ 31) >> 3) + 1;
}
This looks much better than the inline assembly version to me :)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With