Nested function environment selection

I am writing some functions for repeated tasks, and I am trying to minimize the number of times I load the data. I have one function that takes some information and makes a plot, and a second (wrapper) function that loops over inputs and writes multiple plots to a .pdf. Both functions contain the following line of code:

if(load.dat) load("myworkspace.RData")

where load.dat is a logical and the data I need are stored in myworkspace.RData. When I call the wrapper function, I do not want to reload the workspace on every call to the inner function. I thought I could load the workspace once in the wrapper function and let the inner function access that data, but I got an error stating otherwise.

My understanding was that when a function cannot find a variable in its local environment (created when the function is called), it looks for the variable in the parent environment.

I assumed the parent environment to the inner function call would be the outer function call. Obviously this is not true:

func1 <- function(...){
  print(var1)
}

func2 <- function(...){
  var1 <- "hello"
  func1(...)
}

> func2()
Error in print(var1) : object 'var1' not found

After reading numerous questions, the language manual, and this really helpful blog post, I came up with the following:

var1 <- "hello"
save(list="var1",file="test.RData")
rm(var1)

func3 <- function(...){
  attach("test.RData")
  func1(...)
  detach("file:test.RData")
}

> func3()
[1] "hello"

Is there a better way to do this? Why doesn't func1 look for undefined variables in the local environment created by func2, when it was func2 that called func1?

Note: I did not know how to name this question. If anyone has better suggestions I will change it and edit this line out.

asked Aug 20 '13 at 14:08 by dayne


2 Answers

To illustrate lexical scoping, consider the following:

First let's create a sandbox environment, only to avoid the oh-so-common R_GlobalEnv:

sandbox <- new.env()

Now we put two functions inside it: f, which looks for a variable named x; and g, which defines a local x and calls f:

sandbox$f <- function()
{
    value <- if(exists("x")) x else "not found."
    cat("This is function f looking for symbol x:", value, "\n")
}

sandbox$g <- function()
{
    x <- 123
    cat("This is function g. ")
    f()
}

Technicality: entering function definitions in the console causes them to have their enclosing environment set to R_GlobalEnv, so we manually force the enclosures of f and g to match the environment where they "belong":

environment(sandbox$f) <- sandbox
environment(sandbox$g) <- sandbox
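
As a side note, the manual reassignment can probably be avoided by evaluating the definitions inside the target environment with local(). The sketch below uses a separate, hypothetical environment sandbox2 and a hypothetical variable name probe_x so that it does not interfere with the walkthrough; it assumes probe_x is not defined anywhere on the search path:

```r
# Hypothetical second sandbox, so the walkthrough's `sandbox` is untouched
sandbox2 <- new.env()

# Functions created while evaluating inside sandbox2 get sandbox2 as enclosure
local({
  f <- function() if (exists("probe_x")) probe_x else "not found."
  g <- function() {
    probe_x <- 123   # local to g's frame, invisible to f
    f()
  }
}, envir = sandbox2)

# The enclosure of f is sandbox2 without any manual reassignment
same_env <- identical(environment(sandbox2$f), sandbox2)
lookup   <- sandbox2$g()
```

Here same_env records whether the enclosure was set automatically, and lookup holds what f reports when called from g.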

Now call g. The local variable x = 123 defined inside g is not found by f:

> sandbox$g()
This is function g. This is function f looking for symbol x: not found. 

Now we create an x in the global environment and call g. The function f will look for x first in sandbox, and then in the parent of sandbox, which happens to be R_GlobalEnv:

> x <- 456
> sandbox$g()
This is function g. This is function f looking for symbol x: 456 

Just to check that f looks for x first in its enclosure, we can put an x there and call g:

> sandbox$x <- 789
> sandbox$g()
This is function g. This is function f looking for symbol x: 789 

Conclusion: symbol lookup in R follows the chain of enclosing environments, not the evaluation frames created during execution of nested function calls.
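
Applied to the original question, one option that does not depend on symbol lookup at all is to load the file once into a private environment and pass the data to the inner function explicitly. This is only a minimal sketch: the function names describe_one and describe_all, the data frame dat, and the temporary file standing in for myworkspace.RData are all hypothetical, and a text summary replaces the plotting so the example runs without a graphics device:

```r
# Stand-in for myworkspace.RData (hypothetical data in a temporary file)
rdata_file <- tempfile(fileext = ".RData")
dat <- data.frame(id = 1:3, value = c(10, 20, 30))
save(dat, file = rdata_file)
rm(dat)

# Inner function: receives the data as an argument, so no lookup is involved
describe_one <- function(dat, i) {
  paste("row", i, "has value", dat$value[i])
}

# Wrapper: loads the workspace once, into a private environment
describe_all <- function(file) {
  e <- new.env()
  load(file, envir = e)   # loaded objects land in e
  vapply(seq_len(nrow(e$dat)),
         function(i) describe_one(e$dat, i),
         character(1))
}

result <- describe_all(rdata_file)
print(result)
```

The file is read once per call to the wrapper, and the inner function stays usable on its own because everything it needs arrives through its arguments.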

EDIT: Just adding a link to this very interesting answer from Martin Morgan on the related subject of parent.frame() vs parent.env()
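
The difference between the two can be sketched as follows: parent.frame() gives the frame of the caller, while parent.env(environment()) gives the enclosure used for symbol lookup. The function names outer_fn and inner_fn and the variable local_msg are hypothetical, and the sketch assumes it is run at top level in a session where local_msg is not otherwise defined:

```r
outer_fn <- function() {
  local_msg <- "defined in outer_fn"
  inner_fn()
}

inner_fn <- function() {
  caller    <- parent.frame()             # frame of the function that called inner_fn
  enclosure <- parent.env(environment())  # environment where inner_fn was defined
  list(
    found_in_caller    = exists("local_msg", envir = caller, inherits = FALSE),
    found_in_enclosure = exists("local_msg", envir = enclosure, inherits = FALSE),
    from_caller        = get("local_msg", envir = caller)
  )
}

res <- outer_fn()
str(res)
```

Reaching into the caller's frame with get() works, but it ties inner_fn to whoever happens to call it, which is why passing the value as an argument is usually the cleaner choice.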

answered Sep 19 '22 at 14:09 by Ferdinand.kraft


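You could use a closure: define f1 inside f2, so that the enclosing environment of f1 is the frame of the f2 call, which is where var1 lives:

f2 <- function(...){
  # f1 is created inside f2, so its enclosure is f2's evaluation frame
  f1 <- function(...){
    print(var1)
  }
  var1 <- "hello"
  f1(...)
}
f2()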
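
The same idea could be applied to the workspace-loading case in the question. In the sketch below, the names report_all and report_one and the temporary file are hypothetical. It relies on the documented default of load(), which places the loaded objects in the frame it is called from:

```r
# Stand-in for the saved workspace (temporary file, hypothetical contents)
rdata_file <- tempfile(fileext = ".RData")
var1 <- "hello"
save(var1, file = rdata_file)
rm(var1)

report_all <- function(file, n = 2) {
  load(file)   # default envir is the calling frame, i.e. this function's frame
  # report_one is defined here, so its enclosure is this frame and it sees var1
  report_one <- function(i) paste(var1, i)
  vapply(seq_len(n), report_one, character(1))
}

out <- report_all(rdata_file)
print(out)
```

The workspace is loaded once per call to the wrapper, at the cost of the inner function no longer being callable on its own.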
answered Sep 19 '22 at 14:09 by Karl Forner