The strlen()
from K&R takes only a few lines.
int strlen(char *s)
{
char *p = s;
while (*p != '\0')
p++;
return p - s;
}
But the glibc version is much longer. For simplicity, I removed all the comments and the 64-bit implementation, the extracted version strlen()
looks like this:
size_t strlen(const char *str)
{
const char *char_ptr;
const unsigned long int *longword_ptr;
unsigned long int longword, magic_bits, himagic, lomagic;
for (char_ptr = str; ((unsigned long int) char_ptr
& (sizeof (longword) - 1)) != 0; ++char_ptr)
if (*char_ptr == '\0')
return char_ptr - str;
longword_ptr = (unsigned long int *) char_ptr;
himagic = 0x80808080L;
lomagic = 0x01010101L;
for (;;)
{
longword = *longword_ptr++;
if (((longword - lomagic) & himagic) != 0)
{
const char *cp = (const char *) (longword_ptr - 1);
if (cp[0] == 0)
return cp - str;
if (cp[1] == 0)
return cp - str + 1;
if (cp[2] == 0)
return cp - str + 2;
if (cp[3] == 0)
return cp - str + 3;
}
}
}
With the help of the very helpful comment (click here), I got most of how this works. Instead of checking for '\0'
byte by byte, the glibc strlen()
checks every word(4 bytes in 32-bit machine, 8 byte in 64-bit machine). In this way, when the string is relatively long, the performance can be improved.
It checks the first few characters by reading byte by byte, until char_ptr
is aligned on a longword
boundary. Then it uses a loop to check if the longword
has any bytes with of all-zeros. If there is, check which byte, and return the result.
The part I don't get is, how does this check if one byte in longword
is all-zeros?
if (((longword - lomagic) & himagic) != 0)
I can build a longword
value of 0x81818181
, which can make 0x81818181 - 0x01010101) & 0x80808080
not equal to 0
, but there are no all-zero bytes.
Is this related with the fact that ASCII values range from 0
to 127
, so 0x81
isn't valid ASCII? But I don't think the C standard forces strings to use ASCII.
I figured it out. Can't believe it's that simple, I spent the last half an hour on it.
It's OK that the check
if (((longword - lomagic) & himagic) != 0)
lets values like 0x81818181
pass, because if it passes, the following test on every byte would not return since there are no all-zero byte. So the loop can continue to test the next longword
.
The algorithm behind the check is based on Determine if a word has a zero byte
unsigned int v;
bool hasZeroByte = ~((((v & 0x7F7F7F7F) + 0x7F7F7F7F) | v) | 0x7F7F7F7F);
In 2's complement, - 0x01010101
has the same effect with + 0xFEFEFEFF
. The difference is because glibc doesn't have v & 0x7F7F7F7F
, which makes sure the bytes in the word has the most significant bit of 0
. This prevents examples like 0x81818181
, but glibc omits it because it doesn't have to let it pass as stated earlier, The check is correct as long as it won't miss any word that does have all-zero bytes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With