gcc
(I tried 4.7.2
on Mac and Linux with -O3
flag) optimizes ackermann function to a single call with a big local stack. An example Ackermann code below:
int ack(int m,int n){
if(m == 0) return n+1;
if(n == 0) return ack(m-1,1);
return ack(m-1,ack(m,n-1));
}
When disassembled, there is only one recursive call to ack
function, instead of two calls (I couldn't parse what is going on -ack
is now transformed by gcc into a function with 8 arguments, and local stack of 49 int and 9 char). I tried to look up what kind of transformation passes gcc did to optimize Ackermann function to a single call, but didn't find anything of interest. I will appreciate pointers on what major transformation passes gcc performed to convert the deeply recursive Ackermann to a single recursive call. LLVM gcc (I tried v4.2 on mac) doesn't reduce it to single recursive call yet, and is 4x slower with -O3
flag. This optimization seems very interesting.
The first pass is tail-call elimination. GCC does this at most optimization levels. Essentially, all function calls in tail position are transformed into goto's, like this:
int ack(int m, int n) {
begin:
if (m == 0) return n + 1;
if (n == 0) { m -= 1; n = 1; goto begin; }
n = ack(m, n-1); m -= 1; goto begin;
}
Now there is only one recursive call remaining and GCC, at -O3 level only, inlines this for a couple of iterations. Resulting in the huge monster you saw.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With