FreeBSD's generic implementation of memchr
does:
void *
memchr(const void *s, int c, size_t n)
{
if (n != 0) {
const unsigned char *p = s;
do {
if (*p++ == (unsigned char)c)
return ((void *)(p - 1));
} while (--n != 0);
}
return (NULL);
}
which to me seems unnecessarily complicated; the initial n != 0
check with do
-while
just to avoid the p
declaration seems completely pointless. However, I am particularly interested in why the loop body does:
if (*p++ == (unsigned char)c)
return ((void *)(p - 1));
instead of a more straightforward:
if (*p == (unsigned char)c)
return ((void *) p);
++p;
Does inlining the post-increment with the condition have some optimization benefit for some compiler/platform?
First: This is pure speculation. I have not written this code nor could I verify my guess.
There's one very important semantic difference between those two versions of the code:
// Version A
if (*p++ == (unsigned char)c)
return ((void *)(p - 1));
// Version B
if (*p == (unsigned char)c)
return ((void *) p);
++p;
In version A the increment is sequenced before the code block of that if
, whereas in version B it is sequenced after that block.
Thus in version A the increment code will be placed before the branch instruction which is likely generated from the if
. At least we can IMO assume such a relatively direct translation from C code to assembly from compilers of the time when the code was written (1988?).
Does inlining the post-increment with the condition have some optimization benefit for some compiler/platform?
Having the increment before the branch allows for a relatively simple optimisation on architectures whose branch instructions have a delay slot: you can move the increment into that delay slot instead of having a NOP there.
So version A needs one instruction less per loop iteration than version B, at the cost of a single decrement when the function returns. It's a (micro) optimisation.
The case of the pointer post-increment is to allow compilers of the era (e.g. PCC) to use the auto-increment addressing mode in DEC machines (like Mark Plotnick mentions in a comment).
Since all DEC machines support auto-increment addressing, this way of coding loops was once very common (BTW, the m68k supports the same optimization).
The do-while
loop on the other hand is simply because on a naïf compiler it tends to yield better code, not just to avoid setting p
.
Neither should do any difference on a modern compiler.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With