I am writing some functions for repeated tasks, and I am trying to minimize the number of times I load the data. I have one function that takes some information and makes a plot, and a second function that loops through and outputs multiple plots to a .pdf. Both functions contain the following line of code:
if(load.dat) load("myworkspace.RData")
where load.dat is a logical and the data I need is stored in myworkspace.RData. When I call the wrapper function that loops through and outputs multiple plots, I do not want to reload the workspace in every call to the inner function. I thought I could load the workspace once in the wrapper function and the inner function could then access that data, but I got an error stating otherwise.
My understanding was that when a function cannot find a variable in its local environment (created when the function is called), it looks for the variable in the parent environment.
I assumed the parent environment of the inner function call would be the outer function call. Evidently this is not true:
func1 <- function(...){
  print(var1)
}
func2 <- function(...){
  var1 <- "hello"
  func1(...)
}
> func2()
Error in print(var1) : object 'var1' not found
After reading numerous questions, the language manual, and this really helpful blog post, I came up with the following:
var1 <- "hello"
save(list="var1",file="test.RData")
rm(var1)
func3 <- function(...){
  attach("test.RData")
  func1(...)
  detach("file:test.RData")
}
> func3()
[1] "hello"
Is there a better way to do this? Why doesn't func1 look for undefined variables in the local environment created by func2, when it was func2 that called func1?
Note: I did not know how to name this question. If anyone has better suggestions I will change it and edit this line out.
To illustrate lexical scoping, consider the following:
First let's create a sandbox environment, only to avoid the oh-so-common R_GlobalEnv:
sandbox <- new.env()
Now we put two functions inside it: f, which looks for a variable named x; and g, which defines a local x and calls f:
sandbox$f <- function()
{
  value <- if(exists("x")) x else "not found."
  cat("This is function f looking for symbol x:", value, "\n")
}
sandbox$g <- function()
{
  x <- 123
  cat("This is function g. ")
  f()
}
Technicality: entering function definitions in the console causes them to have their enclosing environment set to R_GlobalEnv, so we manually force the enclosures of f and g to match the environment where they "belong":
environment(sandbox$f) <- sandbox
environment(sandbox$g) <- sandbox
Now we call g. The local variable x = 123 is not found by f:
> sandbox$g()
This is function g. This is function f looking for symbol x: not found.
Now we create an x in the global environment and call g. The function f will look for x first in sandbox, and then in the parent of sandbox, which happens to be R_GlobalEnv:
> x <- 456
> sandbox$g()
This is function g. This is function f looking for symbol x: 456
Just to check that f looks for x first in its enclosure, we can put an x there and call g:
> sandbox$x <- 789
> sandbox$g()
This is function g. This is function f looking for symbol x: 789
Conclusion: symbol lookup in R follows the chain of enclosing environments, not the evaluation frames created during execution of nested function calls.
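To tie this back to the question's func1 and func2, here is a minimal sketch of the same idea: changing a function's enclosure changes where it finds free variables. The simplified func1 below (returning var1 instead of printing it) is an assumption made for illustration, not the asker's exact code.

```r
# Simplified stand-in for the question's func1: returns var1 rather than printing it.
func1 <- function() var1

func2 <- function() {
  var1 <- "hello"
  # This assignment creates a local copy of func1 whose enclosure is
  # func2's evaluation frame; the original func1 is left untouched.
  environment(func1) <- environment()
  func1()
}

res <- func2()
print(res)
```

The local copy of func1 now finds var1 because lookup starts in its new enclosure, which is func2's frame. This is shown only to illustrate the rule; rewiring enclosures in everyday code is probably harder to follow than passing the data as an argument.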
EDIT: Just adding a link to this very interesting answer from Martin Morgan on the related subject of parent.frame() vs parent.env()
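As a rough sketch of that distinction: parent.frame() returns the environment of the caller, while parent.env(environment()) returns the enclosure of the running function. The names callee, caller, and msg are hypothetical, and the example assumes no object called msg exists in the environment where callee is defined.

```r
callee <- function() {
  caller_frame <- parent.frame()             # evaluation frame of whoever called us
  enclosure    <- parent.env(environment())  # environment where callee was defined
  c(in_caller    = exists("msg", envir = caller_frame, inherits = FALSE),
    in_enclosure = exists("msg", envir = enclosure, inherits = FALSE))
}

caller <- function() {
  msg <- "hello"  # local to caller's frame only
  callee()
}

result <- caller()
print(result)
```

Under the stated assumption, in_caller is TRUE and in_enclosure is FALSE: msg lives in the calling frame, which ordinary symbol lookup never visits.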
You could use closures:
f2 <- function(...){
  f1 <- function(...){
    print(var1)
  }
  var1 <- "hello"
  f1(...)
}
f2()
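Another option, sketched here under assumptions, is to load the workspace once into its own environment in the wrapper and hand that environment to the inner function as an argument. The names plot_one and plot_all are hypothetical stand-ins for the question's inner and wrapper functions, and the "plot" is replaced by a string so the example stays self-contained.

```r
# Hypothetical inner function: uses the data it is given instead of loading it.
plot_one <- function(id, dat) {
  paste(id, dat$var1)  # stand-in for the real plotting code
}

# Hypothetical wrapper: loads the workspace a single time, then loops.
plot_all <- function(ids, file) {
  dat <- new.env()
  load(file, envir = dat)  # objects land in dat, not in the global environment
  vapply(ids, plot_one, character(1), dat = dat)
}

# Demo with a throwaway workspace file.
tmp <- tempfile(fileext = ".RData")
var1 <- "hello"
save(list = "var1", file = tmp)
rm(var1)

out <- plot_all(c("a", "b"), tmp)
print(out)
```

Passing the environment explicitly avoids both attach() and any reliance on where the inner function happens to be defined, at the cost of one extra argument.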