Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

when to free a closure's memory in a lisp interpreter

I'm writing a simple lisp interpreter from scratch. I have a global environment that top level variables are bound in during evaluation of all the forms in a file. When all the forms in the file have been evaluated, the top level env and all of the key value data structs inside of it are freed.

When the evaluator encounters a lambda form, it creates a PROC object that contains 3 things: a list of arguments to be bound in a local frame when the procedure is applied, the body of the function, and a pointer to the environment it was created in. For example:

(lambda (x) x)

would produce something internally like:

PROC- args: x, 
      body: x, 
      env: pointer to top level env

When the PROC is applied, a new environment is created for the frame and the local bindings are staged there to allow the body to be evaluated with the appropriate bindings. This frame environment contains a pointer to its closure to allow variable lookup inside of THAT. In this case, that would be the global environment. After the PROC body is evaluated, I can free all the cells associated with it including its frame environment, and exit with no memory leaks.

My problem is with higher order functions. Consider this:

(define conser 
    (lambda (x)
        (lambda (y) (cons x y))))

A function that takes one argument and produces another function that will cons that argument to something you pass into it. So,

(define aconser (conser '(1)))

Would yield a function that cons'es '(1) to whatever is passed into it. ex:

(aconser '(2)) ; ((1) 2) 

My problem here is that aconser must retain a pointer to the environment it was created in, namely that of conser when is was produced via the invocation (conser '(1)). When aconser the PROC is applied, its frame must point to the frame of conser that existed when aconser was defined, so I can't free the frame of conser after applying it. I don't know how/the best way to both free the memory associated with a lambda frame when it is applied and also support this kind of persistent higher order function.

I can think of some solutions:

  • some type of ARC

  • copying the enclosing environment into the frame of the evaluated PROC when it is produced

This seems to be what is being implied here. So, instead of saving a pointer in the PROC object to its closure, I would... copy the closure environment and store a pointer to that directly in the cell? Would this not just be kicking the can one level deeper and result in the same problem?

  • recursively substituting the labels at read time inside of the body of the higher order function

I am worried I might be missing something very simple here, and also I am curious as to how this procedure is supported in other implementations of lisp and other languages with closures in general. I have not had much luck searching for answers because the question is very specific, perhaps even to this implementation (that I am admittedly just pulling out of my hat as a learning project) and much of what I am able to find simply explains the particulars of closures from the language being implemented's perspective, not from the language that the language is being implemented in's.

Here is a link to the relevant line in my source, if it is helpful, and I am happy to elaborate if this question is not detailed enough to describe the problem thoroughly. Thanks!

like image 363
jfo Avatar asked Nov 09 '22 14:11

jfo


1 Answers

The way this is handled usually in naive interpreters is to use a garbage-collector (GC) and allocate your activation frames in the GC'd heap. So you never explicitly free those frames, you let the GC free them when applicable.

In more sophisticated implementations, you can use a slightly different approach:

  • when a closure is created, don't store a pointer to the current environment. Instead, copy the value of those variables which are used by the closure (it's called the free variables of the lambda). and change the closure's body to use those copies rather than look in the environment for those variables. It's called closure conversion.
  • Now you can treat your environment as a normal stack, and free activation frames as soon as you exit a scope.
  • You still need a GC to decide when closures can be freed.
  • this in turn requires an "assignment conversion": copying the value of variables implies a change of semantics if those variables get modified. So to recover the original semantics, you need to look for those variables which are "copied into a closure" as well as "modified", and turn them into "reference cells" (e.g. a cons cell where you keep the value in the car), so that the copy doesn't copy the value any more, but just copies a reference to the actual place where the value is kept. [ Side note: such an implementation obviously implies that avoiding setq and using a more functional style may end up being more efficient. ]

The more sophisticated implementation also has the advantage that it can provide a safe for space semantics: a closure will only hold on to data to which it actually refers, contrary to the naive approach where closures end up referring to the whole surrounding environment and hence can prevent the GC from collecting data that is not actually referenced but just happened to be in the environment at the time it was captured by the closure.

like image 193
Stefan Avatar answered Nov 14 '22 22:11

Stefan