Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why C compilers optimize switch and if differently

I was working on a personal project recently when I stumbled across an odd issue.

In a very tight loop I have an integer with a value between 0 and 15. I need to get -1 for values 0, 1, 8, and 9 and 1 for values 4, 5, 12, and 13.

I turned to godbolt to check a few options and was surprised that it seemed the compiler couldn't optimize a switch statement the same way as an if chain.

The link is here: https://godbolt.org/z/WYVBFl

The code is:

const int lookup[16] = {-1, -1, 0, 0, 1, 1, 0, 0, -1, -1, 0, 0, 1, 1, 0, 0};

int a(int num) {
    return lookup[num & 0xF];
}

int b(int num) {
    num &= 0xF;

    if (num == 0 || num == 1 || num == 8 || num == 9) 
        return -1;

    if (num == 4 || num == 5 || num == 12 || num == 13)
        return 1;

    return 0;
}

int c(int num) {
    num &= 0xF;
    switch (num) {
        case 0: case 1: case 8: case 9: 
            return -1;
        case 4: case 5: case 12: case 13:
            return 1;
        default:
            return 0;
    }
}

I would have thought that b and c would yield the same results, and I was hoping that I could read the bit-hacks to come up with an efficient implementation myself since my solution (the switch statement - in another form) was fairly slow.

Oddly, b compiled to bit-hacks while c was either pretty much un-optimized or reduced to a different case of a depending on target hardware.

Can anybody explain why there is this discrepancy? What is the 'correct' way to optimize this query?

EDIT:

Clarification

I want the switch solution to be the fastest, or a similarly "clean" solution. However when compiled with optimizations on my machine the if solution is significantly faster.

I wrote a quick program to demonstrate and TIO has the same results as I find locally: Try it online!

With static inline the lookup table speeds up a bit: Try it online!

like image 292
LambdaBeta Avatar asked Oct 10 '19 22:10

LambdaBeta


2 Answers

If you explicitely enumerate all the cases, gcc is very efficient :

int c(int num) {
    num &= 0xF;
    switch (num) {
        case 0: case 1: case 8: case 9: 
            return -1;
        case 4: case 5: case 12: case 13:
            return 1;
            case 2: case 3: case 6: case 7: case 10: case 11: case 14: case 15: 
        //default:
            return 0;
    }
}

is just compiled in a simple indexed branch :

c:
        and     edi, 15
        jmp     [QWORD PTR .L10[0+rdi*8]]
.L10:
        .quad   .L12
        .quad   .L12
        .quad   .L9
        .quad   .L9
        .quad   .L11
        .quad   .L11
        .quad   .L9
        .quad   .L9
        .quad   .L12
etc...

Note that if default: is uncommented, gcc turns back to its nested branch version.

like image 181
Alain Merigot Avatar answered Sep 19 '22 14:09

Alain Merigot


C compilers have special cases for switch, because they expect programmers to understand the idiom of switch and exploit it.

Code like:

if (num == 0 || num == 1 || num == 8 || num == 9) 
    return -1;

if (num == 4 || num == 5 || num == 12 || num == 13)
    return 1;

would not pass review by competent C coders; three or four reviewers would simultaneously exclaim "this should be a switch!"

It's not worth it for C compilers to analyze the structure of if statements for conversion to a jump table. The conditions for that have to be just right, and the amount of variation that is possible in a bunch of if statements is astronomical. The analysis is both complicated and likely to come up negative (as in: "no, we can't convert these ifs to a switch").

like image 30
Kaz Avatar answered Sep 21 '22 14:09

Kaz