Very simply, what is tail-call optimization?
More specifically, what are some small code snippets where it could be applied, and where not, with an explanation of why?
Tail call optimization is the specific use of tail calls in a function or subroutine that eliminate the need for additional stack frames. Tail call optimization can be part of efficient programming and the use of the values that subroutines return to a program to achieve more agile results or use fewer resources.
Tail call optimization means that, if the last expression in a function is a call to another function, then the engine will optimize so that the call stack does not grow.
TCO (Tail Call Optimization) is the process by which a smart compiler can make a call to a function and take no additional stack space. The only situation in which this happens is if the last instruction executed in a function f is a call to a function g (Note: g can be f).
This kind of function call is called a tail call, and languages like Haskell, Scala, and Scheme can avoid keeping around unnecessary stack frames in such calls. This is called tail call optimization (TCO) or tail call elimitation.
Tail-call optimization is where you are able to avoid allocating a new stack frame for a function because the calling function will simply return the value that it gets from the called function. The most common use is tail-recursion, where a recursive function written to take advantage of tail-call optimization can use constant stack space.
Scheme is one of the few programming languages that guarantee in the spec that any implementation must provide this optimization, so here are two examples of the factorial function in Scheme:
(define (fact x) (if (= x 0) 1 (* x (fact (- x 1))))) (define (fact x) (define (fact-tail x accum) (if (= x 0) accum (fact-tail (- x 1) (* x accum)))) (fact-tail x 1))
The first function is not tail recursive because when the recursive call is made, the function needs to keep track of the multiplication it needs to do with the result after the call returns. As such, the stack looks as follows:
(fact 3) (* 3 (fact 2)) (* 3 (* 2 (fact 1))) (* 3 (* 2 (* 1 (fact 0)))) (* 3 (* 2 (* 1 1))) (* 3 (* 2 1)) (* 3 2) 6
In contrast, the stack trace for the tail recursive factorial looks as follows:
(fact 3) (fact-tail 3 1) (fact-tail 2 3) (fact-tail 1 6) (fact-tail 0 6) 6
As you can see, we only need to keep track of the same amount of data for every call to fact-tail because we are simply returning the value we get right through to the top. This means that even if I were to call (fact 1000000), I need only the same amount of space as (fact 3). This is not the case with the non-tail-recursive fact, and as such large values may cause a stack overflow.
Let's walk through a simple example: the factorial function implemented in C.
We start with the obvious recursive definition
unsigned fac(unsigned n) { if (n < 2) return 1; return n * fac(n - 1); }
A function ends with a tail call if the last operation before the function returns is another function call. If this call invokes the same function, it is tail-recursive.
Even though fac()
looks tail-recursive at first glance, it is not as what actually happens is
unsigned fac(unsigned n) { if (n < 2) return 1; unsigned acc = fac(n - 1); return n * acc; }
ie the last operation is the multiplication and not the function call.
However, it's possible to rewrite fac()
to be tail-recursive by passing the accumulated value down the call chain as an additional argument and passing only the final result up again as the return value:
unsigned fac(unsigned n) { return fac_tailrec(1, n); } unsigned fac_tailrec(unsigned acc, unsigned n) { if (n < 2) return acc; return fac_tailrec(n * acc, n - 1); }
Now, why is this useful? Because we immediately return after the tail call, we can discard the previous stackframe before invoking the function in tail position, or, in case of recursive functions, reuse the stackframe as-is.
The tail-call optimization transforms our recursive code into
unsigned fac_tailrec(unsigned acc, unsigned n) { TOP: if (n < 2) return acc; acc = n * acc; n = n - 1; goto TOP; }
This can be inlined into fac()
and we arrive at
unsigned fac(unsigned n) { unsigned acc = 1; TOP: if (n < 2) return acc; acc = n * acc; n = n - 1; goto TOP; }
which is equivalent to
unsigned fac(unsigned n) { unsigned acc = 1; for (; n > 1; --n) acc *= n; return acc; }
As we can see here, a sufficiently advanced optimizer can replace tail-recursion with iteration, which is far more efficient as you avoid function call overhead and only use a constant amount of stack space.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With