Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does the order of cases in a switch statement affect performance?

I have a switch case program:

Ascending order switch cases :

int main() {         int a, sc = 1;         switch (sc)         {                 case 1:                         a = 1;                         break;                 case 2:                         a = 2;                         break;         } } 

Assembly of code:

main:         push    rbp         mov     rbp, rsp         mov     DWORD PTR [rbp-4], 1         mov     eax, DWORD PTR [rbp-4]         cmp     eax, 1         je      .L3         cmp     eax, 2         je      .L4         jmp     .L2 .L3:         mov     DWORD PTR [rbp-8], 1         jmp     .L2 .L4:         mov     DWORD PTR [rbp-8], 2         nop .L2:         mov     eax, 0         pop     rbp         ret 

Descending order switch cases:

int main() {         int a, sc = 1;         switch (sc)         {                 case 2:                         a = 1;                         break;                 case 1:                         a = 2;                         break;         } } 

Assembly of code:

main:         push    rbp         mov     rbp, rsp         mov     DWORD PTR [rbp-4], 1         mov     eax, DWORD PTR [rbp-4]         cmp     eax, 1         je      .L3         cmp     eax, 2         jne     .L2         mov     DWORD PTR [rbp-8], 1         jmp     .L2 .L3:         mov     DWORD PTR [rbp-8], 2         nop .L2:         mov     eax, 0         pop     rbp         ret 

Here, ascending order cases generated more assembly than descending order.

So, if I have more numbers of switch cases, then does the order of cases affect performance?

like image 543
msc Avatar asked Nov 03 '17 06:11

msc


People also ask

Does order matter in switch statement?

The order of cases does not matter as long as you always have the reserved word “break” at the end of all cases that shouldn't have fall-through. The result of the expression in the switch statement must have the same type and value as a case in order to execute that case's instructions.

Are switch cases faster than if-else?

A switch statement is significantly faster than an if-else ladder if there are many nested if-else's involved. This is due to the creation of a jump table for switch during compilation. As a result, instead of checking which case is satisfied throughout execution, it just decides which case must be completed.

What is the sequence of execution of switch statement?

Execution of the switch statement body begins at the first statement in or after the matching labeled-statement . Execution proceeds until the end of the body, or until a break statement transfers control out of the body.


2 Answers

You're looking at unoptimized code, so studying it for performance isn't very meaningful. If you look at optimized code for your examples, you'll find that it doesn't do the comparisons at all! The optimizer notices that the switch variable sc always has the value 1, so it removes the unreachable case 2.

The optimizer also sees that the variable a isn't used after it's assigned, so it removes the code in case 1 as well, leaving main() an empty function. And it removes the function prolog/epilog that manipulates rbp since that register is unused.

So the optimized code ends up the same for either version of your main() function:

main:     xor eax, eax     ret 

In short, for the code in the question, it doesn't matter which order you put the case statements, because none of that code will be generated at all.

Would the case order matter in a more real-life example where the code actually is generated and used? Probably not. Note that even in your unoptimized generated code, both versions test for the two case values in numeric order, checking first for 1 and then for 2, regardless of the order in the source code. Clearly the compiler is doing some sorting even in the unoptimized code.

Be sure to note Glenn and Lundin's comments: the order of the case sections is not the only change between your two examples, the actual code is different too. In one of them, the case values match the values set into a, but not so in the other.

Compilers use various strategies for switch/case statements depending on the actual values used. They may use a series of comparisons as in these examples, or perhaps a jump table. It can be interesting to study the generated code, but as always, if performance matters, watch your optimization settings and test it in a real-life situation.

like image 111
Michael Geary Avatar answered Sep 23 '22 09:09

Michael Geary


Compiler optimization of switch statements is tricky. Of course, you need to enable optimizations (e.g. try to compile your code with gcc -O2 -fverbose-asm -S with GCC and look inside the generated .s assembler file). BTW on both of your examples my GCC 7 on Debian/Sid/x86-64 gives simply:

        .type   main, @function main: .LFB0:         .cfi_startproc # rsp.c:13: }         xorl    %eax, %eax      #         ret         .cfi_endproc 

(so there in no trace of switch in that generated code)

If you need to understand how a compiler could optimize switch, there are some papers on that subject, such as this one.

If I have more numbers of switch cases, then an order of cases effect on performance?

Not in general, if you are using some optimizing compiler and asking it to optimize. See also this.

If that matters to you so much (but it should not, leave micro-optimizations to your compiler!), you need to benchmark, to profile and perhaps to study the generated assembler code. BTW, cache misses and register allocation could matter much more than order of case-s so I think you should not bother at all. Keep in mind the approximate timing estimates of recent computers. Put the cases in the most readable order (for the next developer working on that same source code). Read also about threaded code. If you have objective (performance related) reasons to re-order the case-s (which is very unlikely and should happen at most once in your lifetime), write some good comment explaining those reasons.

If you care that much about performance, be sure to benchmark and profile, and choose a good compiler and use it with relevant optimization options. Perhaps experiment several different optimization settings (and maybe several compilers). You may want to add -march=native (in addition of -O2 or -O3). You could consider compiling and linking with -flto -O2 to enable link-time optimizations, etc. You might also want profile based optimizations.

BTW, many compilers are huge free software projects (in particular GCC and Clang). If you care that much about performance, you might patch the compiler, extend it by adding some additional optimization pass (by forking the source code, by adding some plugin to GCC or some GCC MELT extensions). That requires months or years of work (notably to understand the internal representations and organization of that compiler).

(Don't forget to take development costs into account; in most cases, they cost much more)

like image 37
Basile Starynkevitch Avatar answered Sep 24 '22 09:09

Basile Starynkevitch