I have a few questions about the different environments of a function. Take the following example:
environment(sd)
# <environment: namespace:stats>
Does namespace:stats point to the enclosing environment of function sd?
pryr::where(sd)
# <environment: package:stats>
Does package:stats point to the binding environment of function sd?
According to Advanced R by Hadley Wickham: "The enclosing environment belongs to the function, and never changes..."
But the enclosing environment of a function can be changed like this:
new.env <- new.env()
environment(f) <- new.env
A function's environment property indicates the function's executing environment, correct? (See an online article regarding R finding stuff through environments.)
To sum up my questions:
stats package?
It's similar to a previous post here.
TLDR:
Function environments
You have to distinguish 4 different environments when talking about a function:
find() gives you the binding environment.
environment() gives you the enclosing environment.
Why does this matter
Every environment has a specific function:
emptyenv().
You can change the enclosing environment
Indeed, you can change the enclosing environment. What you cannot change is the enclosing environment of a function from a package. In that case you don't change the enclosing environment; you actually create a copy in the new environment:
> ls()
character(0)
> environment(sd)
<environment: namespace:stats>
> environment(sd) <- globalenv()
> environment(sd)
<environment: R_GlobalEnv>
> ls()
[1] "sd"
> find("sd")
[1] ".GlobalEnv" "package:stats" # two functions sd now
> rm(sd)
> environment(sd)
<environment: namespace:stats>
In this case, the second sd has the global environment as the enclosing and binding environment, but the original sd is still found inside the package environment, and its enclosing environment is still the namespace of that package.
The confusion might arise when you do the following:
> f <- sd
> environment(f)
<environment: namespace:stats>
> find("f")
[1] ".GlobalEnv"
What happens here? The enclosing environment is still the namespace "stats". That's where the function was created. However, the binding environment is now the global environment. That's where the name "f" is bound to the object.
We can change the enclosing environment to a new environment e. If you check now, the enclosing environment becomes e, but e itself is empty. f is still bound in the global environment.
> e <- new.env()
> e
<environment: 0x000000001852e0a8>
> environment(f) <- e
> find("f")
[1] ".GlobalEnv"
> environment(f)
<environment: 0x000000001852e0a8>
> ls(e)
character(0)
The enclosing environment of e is the global environment. So f still works as if its enclosure was the global environment. The environment e is enclosed in it, so if something isn't found in e, the function looks in the global environment and so on.
But because e is an environment, R calls that a parent environment.
> parent.env(e)
<environment: R_GlobalEnv>
> f(1:3)
[1] 1
Namespaces and package environments
This principle is also the "trick" packages use:
The reason for this is simple: objects can only be found inside the environment you are in, or in its enclosing environments.
An illustration:
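As a rough sketch of that lookup chain, the snippet below walks from a function's enclosing environment up to the empty environment and records the name of each environment on the way. The helper name enclosure_chain is hypothetical and not part of base R; the exact list of attached packages in the middle of the chain depends on your session.

```r
# Hypothetical helper: collect the names of all environments that are
# searched when a function looks up a free variable.
enclosure_chain <- function(fun) {
  env <- environment(fun)          # the enclosing environment
  out <- character(0)
  while (!identical(env, emptyenv())) {
    out <- c(out, environmentName(env))
    env <- parent.env(env)         # move one level up
  }
  out
}

chain <- enclosure_chain(stats::sd)
# The chain starts at the namespace of stats and passes through the
# global environment before it ends at the base environment.
print(chain)
```

The global environment appears only after the namespace and its imports, which is why a packaged function keeps finding its own helpers even if you define an object with the same name in your workspace.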
Now suppose you make an environment with the empty environment as a parent. If you use this as the enclosing environment for a function, nothing works any longer, because you now circumvent all the package environments and can't find a single function any more.
> orphan <- new.env(parent = emptyenv())
> environment(f) <- orphan
> f(1:3)
Error in sqrt(var(if (is.vector(x) || is.factor(x)) x else as.double(x), :
could not find function "sqrt"
The parent frame
This is where it gets interesting. The parent frame, or calling environment, is the environment where the values passed as arguments are looked up. But that parent frame can be the local environment of another function. In that case R looks first in the local environment of that other function, then in the enclosing environment of the calling function, and so on all the way up to the global environment and the environments of the attached packages, until it reaches the empty environment. That's where the "object not found" bug sleeps.
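A minimal sketch of the difference between the calling environment and the enclosing environment, using only base R. All function names here (show_x, peek_x, echo, caller) are made up for the illustration.

```r
x <- "global"

# A free variable is resolved lexically: show_x() looks in its own
# enclosing environment (where it was defined), not in its caller.
show_x <- function() x

# parent.frame() gives access to the calling environment explicitly.
peek_x <- function() get("x", envir = parent.frame())

# An argument is a promise that is evaluated in the calling environment.
echo <- function(value) value

caller <- function() {
  x <- "caller-local"
  lexical  <- show_x()   # finds the x next to the definition of show_x
  dynamic  <- peek_x()   # finds the x of caller()
  argument <- echo(x)    # x is evaluated in caller()'s local environment
  c(lexical = lexical, dynamic = dynamic, argument = argument)
}

res <- caller()
print(res)
```

Here the value passed as an argument comes from the local environment of caller(), while the free variable inside show_x() still comes from where show_x() was defined.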
environment(function) gives the function's enclosing environment (i.e. the closure), which is assigned a pointer to the environment in which the function was defined. This convention is called lexical scoping, and is what lets you use patterns like factory functions. Here is a simple example:
factory <- function() {
  # get a reference to the current environment -- i.e. the environment
  # that was created when the function `factory` was called.
  envir = environment()
  data <- 0
  add <- function(x = 1) {
    # we can use the lexical scoping assignment operator to re-assign the value of data
    data <<- data + x
    # return the value of the lexically scoped variable `data`
    return(data)
  }
  return(list(envir = envir, add = add))
}
L = factory()
# check that the environment for L$add is the environment in which it was created
identical(L$envir,environment(L$add))
#> TRUE
L$add()
#> 1
L$add(3)
#> 4
Note that we can re-assign the value of data in the enclosing environment using assign() like so:
assign("data",100,L$envir)
L$add()
#> 101
Also, when we call the function factory() again, another new environment is created and assigned as the closure for the functions defined in that call. This is what allows us to have two separate add() functions which scope to their own separate environments:
M = factory()
M$add()
#> 1
M$add()
#> 2
L$add()
#> 102
The factory function above illustrates the link between a function and its enclosing environment: the search for a variable continues there, and the scoping assignment operator <<- assigns there. The following illustrates the link between a local environment and the calling frame via promises, which is how R passes variables in a function call.
Specifically, when you call a function, R creates promises for the variables and expressions passed as arguments. The value of a promise is passed (copied) from the variable or expression by evaluating the promise in the context of the calling environment when the parameter is force()'d or used, and not sooner!
For example, this factory function takes a parameter which is stored as a promise until the returned function is called:
factory2 <- function(x) {
  out <- function() {
    return(x)
  }
  return(out)
}
Now factory2 behaves intuitively in some cases:
y = 1
f = factory2(y)
f()
#> 1
but not in others:
y = 1
h = factory2(y)
y = 2
h()
#> 2
because the promise for the expression y is not evaluated until h() is called, and in the second example the value of y is 2 by then! Of course, now that the value has been copied from the calling environment into the local environment via promise evaluation, changing the value of y won't affect the value returned by h():
y = 3
h()
#> 2
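If the late evaluation is not what you want, one common approach is to evaluate the promise inside the factory with force(). The sketch below is a hypothetical variant of factory2 (the names factory3 and k are made up for this illustration):

```r
factory3 <- function(x) {
  # evaluate the promise for x right away, in the calling environment
  force(x)
  out <- function() {
    return(x)
  }
  return(out)
}

y = 1
k = factory3(y)
y = 2
k()
#> 1
```

Because the promise is already evaluated when factory3() returns, the later assignment to y no longer influences the value returned by k().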