Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does FreeBSD's implementation of memchr increment its pointer in its condition?

Tags:

c

freebsd

FreeBSD's generic implementation of memchr does:

void *
memchr(const void *s, int c, size_t n)
{
    if (n != 0) {
        const unsigned char *p = s;

        do {
            if (*p++ == (unsigned char)c)
                return ((void *)(p - 1));
        } while (--n != 0);
    }
    return (NULL);
}

which to me seems unnecessarily complicated; the initial n != 0 check with do-while just to avoid the p declaration seems completely pointless. However, I am particularly interested in why the loop body does:

if (*p++ == (unsigned char)c)
    return ((void *)(p - 1));

instead of a more straightforward:

if (*p == (unsigned char)c)
    return ((void *) p);
++p;

Does inlining the post-increment with the condition have some optimization benefit for some compiler/platform?

like image 437
jamesdlin Avatar asked Feb 28 '17 01:02

jamesdlin


2 Answers

First: This is pure speculation. I have not written this code nor could I verify my guess.

There's one very important semantic difference between those two versions of the code:

// Version A
if (*p++ == (unsigned char)c)
    return ((void *)(p - 1));

// Version B
if (*p == (unsigned char)c)
    return ((void *) p);
++p;

In version A the increment is sequenced before the code block of that if, whereas in version B it is sequenced after that block.

Thus in version A the increment code will be placed before the branch instruction which is likely generated from the if. At least we can IMO assume such a relatively direct translation from C code to assembly from compilers of the time when the code was written (1988?).

Does inlining the post-increment with the condition have some optimization benefit for some compiler/platform?

Having the increment before the branch allows for a relatively simple optimisation on architectures whose branch instructions have a delay slot: you can move the increment into that delay slot instead of having a NOP there.

So version A needs one instruction less per loop iteration than version B, at the cost of a single decrement when the function returns. It's a (micro) optimisation.

like image 51
Daniel Jour Avatar answered Oct 23 '22 19:10

Daniel Jour


The case of the pointer post-increment is to allow compilers of the era (e.g. PCC) to use the auto-increment addressing mode in DEC machines (like Mark Plotnick mentions in a comment).

Since all DEC machines support auto-increment addressing, this way of coding loops was once very common (BTW, the m68k supports the same optimization).

The do-while loop on the other hand is simply because on a naïf compiler it tends to yield better code, not just to avoid setting p.

Neither should do any difference on a modern compiler.

like image 2
Ismael Luceno Avatar answered Oct 23 '22 20:10

Ismael Luceno