
Once again: Setting the environment within a function

There are many discussions about scoping, environments and functions already. See e.g. here or here. However, I am not sure I have found a good solution to the following problem:

 df <- data.frame(id=rep(LETTERS[1:2], each=2), x=1:4)
 d <- -1
 myfun <- function(df, d){
   require(plyr)
   new.dat <- ddply(df, .(id), transform, x=x*d)
   return(new.dat)
 }
 myfun(df, 1)

You can easily verify that the globally defined d = -1 was used instead of the d = 1 supplied as the argument. (If no globally defined d exists, an "object not found" error is returned.) The big question is now: how do I make the function use its d argument instead of the globally defined d?

I was under the impression that the following should work:

 myfun2 <- function(df, d){
   here <- environment()
   new.dat <- ddply(df, .(id), transform, x=x*with(here, d))
   return(new.dat)
 }
 myfun2(df, 1)

It is my understanding that with(here, d) retrieves the object d from the environment here. So, the result should be 1. An error is returned, though, saying

  Error in eval(substitute(expr), data, enclos = parent.frame()) : 
   invalid 'envir' argument of type 'closure' 

I am not sure I understand why this does not work, and I would be happy if anyone could shed some light on this, or if you could provide alternative solutions. Note that wrapping the entire ddply-statement into with(...) does not seem to help either.

A solution that does work is to attach the current environment inside the function:

 myfun3 <- function(df, d){
   here <- environment()
   attach(here)
   new.dat <- ddply(df, .(id), transform, x=x*d) 
   detach(here)
   return(new.dat)
 }

but I don't like this solution since it works by masking the globally defined d with the local d, which I think is not very elegant.

Any comments / pointers are appreciated.

asked Jul 05 '14 14:07 by coffeinjunky


1 Answer

To wake up the lazy evaluation and be sure that you are using the local d argument, use force. Add this line:

d <- force(d)

to the start of myfun.


OK, it seems that I misunderstood the problem. In this case, the problem is that transform is called with non-standard evaluation inside ddply: it looks for variables in df and does not search the environment of the function that called ddply, so it doesn't see the local d even if you force it. As Hadley pointed out, you need to wrap transform inside a call to here.

myfun <- function(df, d){
  require(plyr)
  new.dat <- ddply(df, .(id), here(transform), x=x*d)
  return(new.dat)
}
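
To illustrate the lookup problem without plyr, here is a minimal base-R sketch. The helper apply_fun is hypothetical and only stands in for the way ddply passes each piece on to the function it was given; scale_nse and scale_closure are likewise made-up names. It is an assumption-laden illustration, not plyr's actual implementation.

```r
# Hypothetical stand-in for the way ddply hands each piece to .fun.
# It is defined outside the wrapper functions below, so its frame does
# not enclose their arguments.
apply_fun <- function(piece, f, ...) f(piece, ...)

df <- data.frame(id = rep(LETTERS[1:2], each = 2), x = 1:4)
d <- -1

# transform() evaluates x * d in the data frame first, then in the frame
# of its caller (apply_fun), then in apply_fun's enclosing environment.
# The argument d of scale_nse is never on that search path.
scale_nse <- function(df, d) apply_fun(df, transform, x = x * d)

# An anonymous function created inside the wrapper is a closure over the
# wrapper's frame, so transform() finds the argument d before the outer d.
scale_closure <- function(df, d) {
  apply_fun(df, function(piece) transform(piece, x = x * d))
}

res_nse <- scale_nse(df, 1)$x          # computed with the outer d = -1
res_closure <- scale_closure(df, 1)$x  # computed with the argument d = 1
```

The same closure pattern should also work with plyr itself, for example ddply(df, .(id), function(piece) transform(piece, x = x * d)), as an alternative to here(transform).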

Minor unrelated code improvements:
Since you aren't handling the case where require returns FALSE, you should replace it with library.
mutate is an improved drop-in replacement for transform.
You don't need the explicit return.

myfun <- function(df, d){
  library(plyr)
  ddply(df, .(id), here(mutate), x=x*d)
}
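
As a side note on the error from myfun2 in the question: a likely explanation is that, inside ddply, the name here is not resolved to the local variable but to plyr's own function here, so with() receives a function instead of an environment. The base-R sketch below reproduces that error message with a placeholder function (here_fn is a made-up name standing in for plyr's here).

```r
# Placeholder function standing in for plyr's `here` (an assumption made
# for illustration; any function triggers the same error).
here_fn <- function(f) f

# with() passes its first argument to eval() as the evaluation environment;
# a function (type 'closure') is not a valid environment, list or data frame.
msg <- tryCatch(
  with(here_fn, 1),
  error = function(e) conditionMessage(e)
)
print(msg)
```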

answered Sep 28 '22 11:09 (3 revs)