I'm trying to write a Scheme interpreter, but finding TCO difficult to implement. I'm not sure what properties a function must have in order for TCO to kick in.
1) A function with a self-recursive call at the end of the definition:
(define (sum-list list accum)
(if (null list) accum
(sum-list (cdr list) (+ accum 1))))
2) A function with a self-recursive call that isn't at the end of the definition:
(define (sum-list list accum)
(cond ((not (null list))
(sum-list (cdr list) (+ accum 1)))
(else accum)))
3) A function that stores the recursive call in a variable before returning it:
(define (sum-list list accum)
(let ((sum
(if (null list)
accum
(sum-list (cdr list (+ accum 1))))))
sum))
4) Mutually recursive functions:
(define (sum-list list accum)
(if (null list)
accum
(other-function (cdr list) (+ accum 1))))
(define (other-function list accum)
(sum-list list accum))
5) Simply calling another unrelated function at the end of the function definition:
(define (foo)
(bar))
6) The trickiest case I can think of is closures. What if I hold on to a reference to a variable in scope?
(define (sum-list list accum ignored)
(let ((local-var 12345))
(if (null list)
accum
(sum-list
(cdr list)
(+ accum 1)
(lambda () local-var)))))
7) Calling another function at the end of the function definition, with a self-recursive call as an argument:
(define (sum-list list)
(if (null list)
0
(+ 1 (sum-list (cdr list)))))
As I understand it, TCO implementations (in both Scheme and Common Lisp) rewrite TCO-friendly functions, so they must be able to statically detect TCO-friendly functions.
What I'd like to know is:
A function is tail-recursive if it ends by returning the value of the recursive call. Keeping the caller's frame on stack is a waste of memory because there's nothing left to do once the recursive call returns its value.
Tail call optimization happens when the compiler transforms a call immediately followed by a ret into a single jmp . This transformation saves one instruction, and more importantly it eliminates the implicit push/pop from the stack done by call and ret .
This kind of function call is called a tail call, and languages like Haskell, Scala, and Scheme can avoid keeping around unnecessary stack frames in such calls. This is called tail call optimization (TCO) or tail call elimitation.
Tail-call optimization is a part of the ES2015-ES6 specification. Supporting it isn't a NodeJS thing, it's something the V8 engine that NodeJS uses needs to support.
Take a look at the Scheme specification, all the possible tail contexts are defined there. In particular, in R6RS (the current ratified standard) you should check section §11.20 Tail calls and tail contexts:
A tail call is a procedure call that occurs in a tail context. Tail contexts are defined inductively. Note that a tail context is always determined with respect to a particular lambda expression.
The last expression within the body of a lambda expression, shown as <tail expression> below, occurs in a tail context.
(lambda <formals> <definition>* <expression>* <tail expression>)
If one of the following expressions is in a tail context, then the subexpressions shown as <tail expression> are in a tail context. These were derived from specifications of the syntax of the forms described in this chapter by replacing some occurrences of <expression> with <tail expression>. Only those rules that contain tail contexts are shown here.
(if <expression> <tail expression> <tail expression>) (if <expression> <tail expression>) (cond <cond clause>+) (cond <cond clause>* (else <tail sequence>)) (case <expression> <case clause>+) (case <expression> <case clause>* (else <tail sequence>)) (and <expression>* <tail expression>) (or <expression>* <tail expression>) (let <bindings> <tail body>) (let <variable> <bindings> <tail body>) (let* <bindings> <tail body>) (letrec* <bindings> <tail body>) (letrec <bindings> <tail body>) (let-values <mv-bindings> <tail body>) (let*-values <mv-bindings> <tail body>) (let-syntax <bindings> <tail body>) (letrec-syntax <bindings> <tail body>) (begin <tail sequence>)
A <cond clause> is
(<test> <tail sequence>)
, a <case clause> is((<datum>*) <tail sequence>)
, a <tail body> is<definition>* <tail sequence>
, and a <tail sequence> is<expression>* <tail expression>
.If a
cond
expression is in a tail context, and has a clause of the form(<expression1> => <expression2>)
then the (implied) call to the procedure that results from the evaluation of <expression2> is in a tail context. <Expression2> itself is not in a tail context.
You haven't mentioned what runtime or language this is going to interpret Scheme so it might be that this answer is a little off.
The fool proof way of implementing tail call optimization and get call/cc in the same time is to do CPS-conversion. With CPS, every call is a tail call and non-tail recursive code gets nested continuations.
TCO you'll get when you pop the stack before applying the tail call. Since every call is a tail call in CPS you know what to do.
Every of your procedures would benefit from this except 3 and 7. Those has a continuation to do when the recursive call returns.
EDIT Python:
Python doesn't have TCO so you cannot use the host language to do calls directly. You could, however, do it with trampolines.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With