Consider a simple factorial function:
    static int factorial(int n) {
        if (n <= 0) return 1;
        return n * factorial(n - 1);
    }
    int main(int argc, char** argv) {
        return factorial(argc);
    }
Compiling with -O2 under g++ and clang yields a very interesting difference:
See the comparison here (Compiler explorer)
Building locally and comparing runtimes, the simple g++ binary definitely runs faster for all values within reason (i.e. that don't cause overflow) on Ubuntu 17.10.
Can anyone tell me why clang is going to all this trouble, and what it's trying to do (and failing in both size and speed)?
It's trying to minimise the number of test-and-branch operations by vectorising the code.
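As a rough illustration of that idea (not a transcription of what clang actually emits), the recursion can first be rewritten as a loop, and the loop can then keep several independent partial products so that one loop test covers several multiplications. The names `factorial_scalar` and `factorial_unrolled` and the lane count of four are assumptions made for this sketch, and it uses unsigned arithmetic so that wraparound is well defined, unlike the `int` in the question.

```cpp
#include <cstdint>

// Loop form of the question's function: one multiply and one
// test-and-branch per factor.
static std::uint32_t factorial_scalar(std::uint32_t n) {
    std::uint32_t result = 1;
    for (std::uint32_t i = n; i > 0; --i) result *= i;
    return result;
}

// Hand-written analogue of a vectorised loop (hypothetical, four lanes):
// each pass multiplies four factors but performs only one loop test.
static std::uint32_t factorial_unrolled(std::uint32_t n) {
    std::uint32_t lane[4] = {1, 1, 1, 1};
    std::uint32_t i = n;
    for (; i >= 4; i -= 4) {
        lane[0] *= i;
        lane[1] *= i - 1;
        lane[2] *= i - 2;
        lane[3] *= i - 3;
    }
    // Horizontal reduction: combine the partial products.
    std::uint32_t result = lane[0] * lane[1] * lane[2] * lane[3];
    // Scalar tail for the remaining zero to three factors.
    for (; i > 0; --i) result *= i;
    return result;
}
```

The setup, the reduction and the tail loop all cost extra instructions, which is consistent with the larger code size. For the small inputs that fit in an int, the main loop runs only a few times, so the saved branches may not pay for that overhead.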
It's certainly failing on size. As for whether it's failing on speed, have you benchmarked it?
gcc will do the same if you add the command-line option -ftree-vectorize.
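To check the speed question, a small timing harness along the following lines could be built once with each compiler (and once with gcc plus -ftree-vectorize) and the results compared. This is only a sketch: `average_ns` is a hypothetical helper written for this example, and the volatile variables are one simple way, assumed adequate here, to stop the optimiser from folding the call away.

```cpp
#include <chrono>

// The function under test, as in the question.
static int factorial(int n) {
    if (n <= 0) return 1;
    return n * factorial(n - 1);
}

// Hypothetical helper: average nanoseconds per call of f(n) over `reps` calls.
template <typename F>
double average_ns(F f, int n, int reps) {
    volatile int input = n;  // volatile read: the argument cannot be constant-folded
    volatile int sink = 0;   // volatile write: the result cannot be discarded
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r) sink = f(input);
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / reps;
}
```

Calling, say, `average_ns(factorial, 12, 1000000)` from `main` and printing the result gives a per-call figure for each build. The numbers will vary with the machine and compiler version, so several runs are worth taking before drawing conclusions.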