Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

likely(x) and __builtin_expect((x),1)

I know the kernel uses the likely and unlikely macros prodigiously. The docs for the macros are located at Built-in Function: long __builtin_expect (long exp, long c). But they don't really discuss the details.

How exactly does a compiler handle likely(x) and __builtin_expect((x),1)?

Is it handled by the code generator or the optimizer?

Does it depend upon optimization levels?

What's an example of the code generated?

like image 291
jww Avatar asked Jun 19 '14 07:06

jww


People also ask

What is __ Builtin_expect?

You can use the __builtin_expect built-in function to indicate that an expression is likely to evaluate to a specified value. The compiler can use this knowledge to direct optimizations. This built-in function is portable with the GNU C/C++ __builtin_expect function.

When to use unlikely in C?

C++ attribute: likely, unlikely (since C++20) 1) Applies to a statement to allow the compiler to optimize for the case where paths of execution including that statement are more likely than any alternative path of execution that does not include such a statement.


1 Answers

I just tested a simple example on gcc.

For x86 this seems to be handled by the optimizer and depend on optimization levels. Although I guess a correct answer here would be "it depends on the compiler".

The code generated is CPU dependent. Some cpus (sparc64 comes immediately to my mind, but I'm sure there are others) have flags on conditional branch instructions that tell the CPU how to predict it, so the compiler generates "predict true/predict false" instructions depending on the built in rules in the compiler and hints from the code (like __builtin_expect).

Intel documents their behavior here: https://software.intel.com/en-us/articles/branch-and-loop-reorganization-to-prevent-mispredicts . In short the behavior on Intel CPUs is that if the CPU has no previous information about a branch it will predict forward branches as unlikely to be taken, while backwards branches are likely to be taken (think about loops vs. error handling).

This is some example code:

int bar(int);
int
foo(int x)
{
    if (__builtin_expect(x>10, PREDICTION))
        return bar(10);
    return 42;
}

Compiled with (I'm using omit-frame-pointer to make the output more readable, but I still cleaned it up below):

$ cc -S -fomit-frame-pointer -O0 -DPREDICTION=0 -o 00.s foo.c
$ cc -S -fomit-frame-pointer -O0 -DPREDICTION=1 -o 01.s foo.c
$ cc -S -fomit-frame-pointer -O2 -DPREDICTION=0 -o 20.s foo.c
$ cc -S -fomit-frame-pointer -O2 -DPREDICTION=1 -o 21.s foo.c

There's no difference between 00.s and 01.s, so that shows that this is dependent on optimization (for gcc at least).

Here's the (cleaned up) generated code for 20.s:

foo:
    cmpl    $10, %edi
    jg  .L2
    movl    $42, %eax
    ret
.L2:
    movl    $10, %edi
    jmp bar

And here is 21.s:

foo:
    cmpl    $10, %edi
    jle .L6
    movl    $10, %edi
    jmp bar
.L6:
    movl    $42, %eax
    ret

As expected the compiler rearranged the code so that the branch we don't expect to take is done in a forward branch.

like image 87
Art Avatar answered Sep 22 '22 21:09

Art