I was curious about the speed of switch, believing it to be "very" fast, but I have a test case that seems to show a single switch is only about as fast as roughly 4 if tests, when I expected (with no solid reason) it to be as fast as 1 test. Here are the two methods I wrote to compare switch with if:
public static int useSwitch(int i) {
    switch (i) {
        case 1: return 2;
        case 2: return 3;
        case 3: return 4;
        case 4: return 5;
        case 5: return 6;
        case 6: return 7;
        default: return 0;
    }
}

public static int useIf(int i) {
    if (i == 1) return 2;
    if (i == 2) return 3;
    if (i == 3) return 4;
    if (i == 4) return 5;
    if (i == 5) return 6;
    if (i == 6) return 7;
    return 0;
}
Here's my testing code:
long x = 0;
for (int i = 0; i < 999999; i++)
    x += useIf(i % 7); // I use "x" so the calls won't get optimized out
And another identical loop for useSwitch()
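Putting the pieces together, a complete runnable version of this micro-benchmark might look like the following sketch (the class name SwitchVsIf and the System.nanoTime timing are my own additions; this is the same naive loop as in the question, so treat the numbers as indicative only — a proper harness would also warm up the JIT):

```java
public class SwitchVsIf {

    public static int useSwitch(int i) {
        switch (i) {
            case 1: return 2;
            case 2: return 3;
            case 3: return 4;
            case 4: return 5;
            case 5: return 6;
            case 6: return 7;
            default: return 0;
        }
    }

    public static int useIf(int i) {
        if (i == 1) return 2;
        if (i == 2) return 3;
        if (i == 3) return 4;
        if (i == 4) return 5;
        if (i == 5) return 6;
        if (i == 6) return 7;
        return 0;
    }

    public static void main(String[] args) {
        long x = 0;

        long start = System.nanoTime();
        for (int i = 0; i < 999999; i++)
            x += useIf(i % 7);        // accumulate into x so calls aren't optimized out
        long ifTime = System.nanoTime() - start;

        start = System.nanoTime();
        for (int i = 0; i < 999999; i++)
            x += useSwitch(i % 7);
        long switchTime = System.nanoTime() - start;

        System.out.println("if: " + ifTime + " ns, switch: " + switchTime + " ns (x=" + x + ")");
    }
}
```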
On my machine these loops take about the same time to complete, which was a surprise.
I arrived at "4" as the number of ifs because that's roughly the average number of comparisons executed across the input range (I think). If I reduce the number of cases, the if version is noticeably faster.
My question is: is switch actually not that fast, or is this an "unfair" test in some way?
This is an unfair comparison to some extent. Most of the CPU time will be spent processing the modulo operation, i % 7. Integer modulo is slow even on the latest CPUs, and probably takes on the order of 20x longer to execute than either the if() or switch() implementation you're trying to benchmark.
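One way to test that theory (my own sketch, not part of the original benchmark) is to precompute the i % 7 values into an array before timing, so the timed loop contains only the dispatch and an array read, not the division. The class name NoModuloBench is illustrative:

```java
public class NoModuloBench {

    public static int useIf(int i) {
        if (i == 1) return 2;
        if (i == 2) return 3;
        if (i == 3) return 4;
        if (i == 4) return 5;
        if (i == 5) return 6;
        if (i == 6) return 7;
        return 0;
    }

    public static void main(String[] args) {
        int n = 999999;

        // Pay the modulo cost once, up front, outside the timed section.
        int[] inputs = new int[n];
        for (int i = 0; i < n; i++)
            inputs[i] = i % 7;

        long x = 0;
        long start = System.nanoTime();
        for (int i = 0; i < n; i++)
            x += useIf(inputs[i]);    // x keeps the calls from being optimized out
        long elapsed = System.nanoTime() - start;

        System.out.println("if-chain without modulo: " + elapsed + " ns (x=" + x + ")");
    }
}
```

With the division removed from the loop, the difference between the if-chain and a switch version should be a larger fraction of the measured time.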
Furthermore, there are two different ways that switch statements can be optimized. One is a lookup table, which is used for dense, sequential cases like the ones you've put forth. The other is a search partition tree, which is used when the cases are sparse, as in the following example:
switch (someInt) {
    case 0: ... break;
    case 10: ... break;
    case 102: ... break;
    case 6543: ... break;
    case 19303: ... break;
    case 19305: ... break;
    // and so forth...
}
Most compilers will use an unrolled partition tree to find the correct case, which on long switches gives good average- and worst-case numbers of comparisons needed to hit the right one. The resulting pseudo-code would be something like this:
if (someInt >= 6543) {
    if (someInt >= 19303) {
        // continue the tree search in the upper range, etc.
    }
    else if (someInt == 6543) { /* case 6543 */ }
}
else if (someInt >= 0) {
    if (someInt >= 10) {
        // continue the tree search in the lower range, etc.
    }
    else if (someInt == 0) { /* case 0 */ }
}
else {
    // default case handler...
}
It doesn't really help much with just the 6-8 cases shown here, but if you had a switch with perhaps 50+ cases it can be a big win. A linear search is O(n) (50 comparisons worst case, 25 on average), while the partition tree is O(log n) (roughly 6-9 comparisons worst case for 50 cases, 5-7 on average, depending on compiler choices).
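You can watch the compiler choose between the two strategies yourself. In Java, javac emits a tableswitch instruction for dense cases and a lookupswitch for sparse ones; the class below (the name SwitchShapes and the return values are illustrative) has one of each:

```java
public class SwitchShapes {

    // Dense, sequential cases: javac emits a tableswitch,
    // i.e. a direct jump table indexed by the switch value.
    public static int dense(int i) {
        switch (i) {
            case 1: return 2;
            case 2: return 3;
            case 3: return 4;
            default: return 0;
        }
    }

    // Sparse cases: javac emits a lookupswitch (sorted key/offset pairs),
    // which the JIT may then compile into a partition tree of comparisons.
    public static int sparse(int i) {
        switch (i) {
            case 0:     return 1;
            case 102:   return 2;
            case 19303: return 3;
            default:    return 0;
        }
    }
}
```

Compiling this and disassembling with `javap -c SwitchShapes` should show tableswitch in dense() and lookupswitch in sparse().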
Switch is supposed to be faster; it's enough to look at the bytecode:
TABLESWITCH
1: L1
2: L2
3: L3
4: L4
5: L5
6: L6
to see that it is a special operation. In real life there may be no difference because of JVM optimizations, but if we run your code with -Xint (interpret-only mode) then the difference becomes evident: on my PC it is 63 ms to 93 ms in favour of switch.