I want to better understand how environments, closures, and frames are related. I understand function closures contain an environment, environments contain a frame and an enclosure, and frames contain variables, but I'm a bit fuzzy on how they interact with one another.
Perhaps an example of what's going on during a function call would help? Or maybe a diagram?
The enclosing environment is the environment where the function was created. Every function has one and only one enclosing environment. For the three other types of environment, there may be 0, 1, or many environments associated with each function: Binding a function to a name with <- defines a binding environment.
The parent frame of a function evaluation is the environment in which the function was called. It is not necessarily numbered one less than the frame number of the current evaluation, nor is it the environment within which the function was defined.
UPDATE: R-lang defines an environment as having a frame. I tend to think of frames as stack frames, not as mappings from names to values, but then there is of course the data.frame, which maps column names to vectors (and then some...). I think most of the confusion comes from the fact that the original S language (and still S-Plus) did not have environment objects, so all "frames" were essentially what environment objects are now, except that they could only exist as part of the call stack.
For instance, in S-Plus the doc for sys.nframe says "sys.nframe returns the numerical index of the current frame in the list of all frames." ...that sounds an awful lot like stack frames to me... You can read more about stack frames here: http://en.wikipedia.org/wiki/Call_stack#Structure
I expanded some of the explanations below and use the term "stack frame" consistently (I hope).
END UPDATE
I'd explain them like this:
An environment is an object that maps variable names to values. Each mapping is called a binding. The value can be either a real value or a promise. An environment has a parent environment (except for the empty environment). When you look up a symbol in an environment and it isn't found, the parent environments are also searched.
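To make the parent lookup concrete, here is a minimal sketch. The names outer_env and inner_env are made up for illustration; new.env, assign, get, exists, parent.env, baseenv and emptyenv are all base R.

```r
# A parent environment holding one binding.
outer_env <- new.env()
assign("a", 1, envir = outer_env)

# A child environment whose parent is outer_env.
inner_env <- new.env(parent = outer_env)
assign("b", 2, envir = inner_env)

# get() searches the parent environments by default (inherits = TRUE),
# so "a" is found even though it is not bound in inner_env itself.
found_a <- get("a", envir = inner_env)

# With inherits = FALSE only inner_env's own bindings are checked.
a_is_local <- exists("a", envir = inner_env, inherits = FALSE)

# The chain of parents ends at the empty environment, which has no parent.
ends_at_empty <- identical(parent.env(baseenv()), emptyenv())
```

Here found_a comes from the parent, while a_is_local is FALSE because the binding lives one level up.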
A promise is an unevaluated expression and an environment in which to evaluate the expression. When the promise is evaluated it is replaced with the generated value.
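A small sketch of that behaviour, assuming nothing beyond base R. The counter evaluations and the function twice are hypothetical names used only to observe when the argument expression actually runs.

```r
evaluations <- 0

twice <- function(x) {
  before <- evaluations   # the promise for x has not been forced yet
  first <- x              # forces the promise: the argument expression runs now
  second <- x             # reuses the stored value: the expression does not run again
  c(before = before, first = first, second = second)
}

# The argument expression is evaluated in the caller's environment,
# so it updates the counter defined above.
result <- twice({ evaluations <- evaluations + 1; evaluations * 10 })

# A promise that is never used is never evaluated, so stop() is not triggered.
ignored <- (function(x) "done")(stop("never evaluated"))
```

After this runs, the counter has been incremented exactly once even though x was read twice, which is what "replaced with the generated value" amounts to in practice.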
A closure is a function and the environment that the function was defined in. A function like lm would have the stats namespace environment and a user-defined function would have the global environment, but a function f defined within another function g would have the local environment for g as its environment.
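You can check these three cases with environment(). This is only a sketch: g, f and h are throwaway names, and the last check compares against the current environment, which is the global environment only when the code is run at top level.

```r
# Case 1: a function from a package has that package's namespace as its environment.
stats_env <- environmentName(environment(stats::lm))

# Case 2: a function defined at top level has the environment it was defined in
# (the global environment when this is run at top level).
h <- function() NULL
h_env_is_current <- identical(environment(h), environment())

# Case 3: a function f defined inside g has g's local environment.
g <- function() {
  local_env <- environment()   # the local environment of this call to g
  f <- function() NULL
  identical(environment(f), local_env)
}
f_encloses_g <- g()
```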
A stack frame (or activation record) is what represents the entries on the call stack. Each stack frame has the local environment that the function is executed in, and the function call's expression (so that sys.call works).
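As a rough illustration of what a stack frame carries, the sketch below reads the frame index and the call expression from inside a nested call. The functions inner and outer are hypothetical; sys.nframe and sys.call are base R. The absolute frame numbers depend on how the code is run, so only the difference between the two is compared.

```r
inner <- function() {
  depth <- sys.nframe()     # index of this call's stack frame
  this_call <- sys.call()   # the call expression stored with this stack frame
  list(depth = depth, call = this_call)
}

outer <- function() {
  outer_depth <- sys.nframe()
  info <- inner()
  list(
    depth_difference = info$depth - outer_depth,  # inner sits one frame above outer
    call_text = deparse(info$call)
  )
}

frames <- outer()
```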
When a function call is executed, a local environment is created with its parent set to the closure's environment, the arguments are matched against the function's formal arguments and those bindings are added to the local environment (as promises). The unmatched formal arguments are assigned the default values (promises) of the function (if any) and marked as missing. A stack frame is then created with this local environment and the call expression. The stack frame is pushed on the call stack and then the body of the function is evaluated in this local environment.
...so all symbols in the body will be looked up in the local environment (formal arguments and local variables), and if not found there, in the parent environment (which is the closure's environment), then the parent's parent environment, and so on until found.
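The steps above can be inspected from inside a call. This is a minimal sketch with a hypothetical function f; it only uses base R (environment, parent.env, missing, ls).

```r
f <- function(a, b = a * 2) {
  local_env <- environment()   # the local environment created for this call
  list(
    # The local environment's parent is the closure's environment.
    parent_is_closure_env = identical(parent.env(local_env), environment(f)),
    # b was not supplied by the caller, so it is marked as missing...
    b_missing = missing(b),
    # ...but it still has a value: the default promise, evaluated in local_env.
    b = b,
    # Formal arguments and local variables are bound in the local environment.
    locals = ls(local_env)
  )
}

out <- f(5)
```

Note that the default for b can refer to a because default promises are evaluated in the local environment, not in the caller's.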
Note that the parent stack frame's environment is NOT searched in this case. The parent.frame and sys.frame functions get the environments on the call stack, that is, the caller's environment, the caller's caller's environment, and so on.
# Here match.fun needs to look in the caller's caller's environment to find what "x" is...
f <- function(FUN) match.fun(FUN)(1:10)
g <- function() { x=sin; y="x"; f(y) }
g() # same as sin(1:10)
# Here we see that the stack frames must also contain the actual call expression
f <- function(...) sys.call()
g <- function(...) f(..., x=42)
g(a=2) # f(..., x = 42)
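To contrast the two kinds of lookup directly, here is one more sketch. The variable secret_in_g and the functions f and g are made-up names, and it assumes no object called secret_in_g exists anywhere on the search path.

```r
f <- function() {
  # Ordinary lookup follows f's local environment and its parents (the
  # environment f was defined in, and so on). g's local variables are not there.
  lexical <- exists("secret_in_g")
  # Looking in the caller's environment has to be requested explicitly.
  from_caller <- get("secret_in_g", envir = parent.frame())
  list(lexical = lexical, from_caller = from_caller)
}

g <- function() {
  secret_in_g <- 42
  f()
}

res <- g()
```

So res$lexical is FALSE even though g is the caller, while res$from_caller picks up the value through the call stack.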
Does this extended description by John Fox address your questions?
It has nice diagrams, but no ponies.
There is also a description in the Fox & Weisberg book, "An R Companion to Applied Regression" (2011), starting on p. 417, or section 8.9.1. I think the above PDF, though older, is probably just as informative, if not more so (because of the diagrams). F&W is a good book, which I've plugged a couple of times before, for other useful stuff. FWIW, I didn't find any useful insights in the "R in a Nutshell" book. I don't yet have any of Chambers' books, though.