
How to not fall into R's 'lazy evaluation trap'

"R passes promises, not values. The promise is forced when it is first evaluated, not when it is passed." (see this answer by G. Grothendieck). Also see this question, which refers to Hadley's book.

In simple examples such as

> funs <- lapply(1:10, function(i) function() print(i))
> funs[[1]]()
[1] 10
> funs[[2]]()
[1] 10

it is possible to take such unintuitive behaviour into account.
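For reference, the standard workaround in cases like this is to force the promise inside the factory before returning the closure. A minimal sketch (`force(x)` does nothing more than evaluate `x`):

```r
funs <- lapply(1:10, function(i) {
  force(i)              # evaluate the promise now, while i is 1, 2, ...
  function() print(i)
})
funs[[1]]()
# [1] 1
funs[[2]]()
# [1] 2
```

With the promise forced at creation time, each closure captures the value of `i` that was current when it was built, regardless of when it is later called.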

However, I find myself frequently falling into this trap during daily development. I follow a rather functional programming style, which means that I often have a function A returning a function B, where B in some way depends on the parameters with which A was called. The dependency is not as easy to see as in the above example, since the calculations are complex and there are multiple parameters.

Overlooking such an issue leads to difficult-to-debug problems, since all calculations run smoothly, except that the result is incorrect. Only an explicit validation of the results reveals the problem.

On top of that, even when I have noticed such a problem, I am never really sure which variables I need to force and which I don't.

How can I make sure not to fall into this trap? Are there any programming patterns that prevent this or that at least make sure that I notice that there is a problem?

asked Mar 16 '15 by jhin

3 Answers

You are creating functions with implicit parameters, which isn't necessarily best practice. In your example, the implicit parameter is i. Another way to rework it would be:

library(functional)
myprint <- function(x) print(x)
funs <- lapply(1:10, function(i) Curry(myprint, i))
funs[[1]]()
# [1] 1
funs[[2]]()
# [1] 2

Here, we explicitly specify the parameters to the function by using Curry. Note we could have curried print directly but didn't here for illustrative purposes.

Curry creates a new version of the function with parameters pre-specified. This makes the parameter specification explicit and avoids the potential issues you are running into because Curry forces evaluations (there is a version that doesn't, but it wouldn't help here).
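If you would rather not depend on the functional package, the same idea can be sketched in base R with a hand-rolled partial-application helper (`partial` here is a hypothetical name, not a base function):

```r
partial <- function(f, x) {
  force(f); force(x)    # evaluate both promises before building the closure
  function() f(x)
}
funs <- lapply(1:10, function(i) partial(print, i))
funs[[1]]()
# [1] 1
funs[[2]]()
# [1] 2
```

The key point is the same as with Curry: the parameter is bound explicitly and evaluated at creation time, so the closure cannot silently pick up a later value.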

Another option is to capture the entire environment of the parent function, copy it, and make it the parent env of your new function:

funs2 <- lapply(
  1:10, function(i) {
    fun.res <- function() print(i)
    environment(fun.res) <- list2env(as.list(environment()))  # force parent env copy
    fun.res
  }
)
funs2[[1]]()
# [1] 1
funs2[[2]]()
# [1] 2

but I don't recommend this since you will be potentially copying a whole bunch of variables you may not even need. Worse, this gets a lot more complicated if you have nested layers of functions that create functions. The only benefit of this approach is that you can continue your implicit parameter specification, but again, that seems like bad practice to me.

answered Oct 23 '22 by BrodieG


As others pointed out, this might not be the best style of programming in R. But one simple option is to just get into the habit of forcing everything. If you do this, realize that you don't need to actually call force; just evaluating the symbol will do it. To make it less ugly, you could make it a practice to start functions like this:

myfun <- function(x, y, z) {
  x; y; z  # evaluate each argument to force its promise
  ## code
}
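Applied to a function factory, the habit looks like this (a toy `adder`, assuming nothing beyond base R):

```r
adder <- function(n) {
  n                     # touching the symbol forces the promise
  function(x) x + n
}
add2 <- adder(2)
add2(5)
# [1] 7
```

Without the bare `n` on the first line, `n` would remain an unevaluated promise until the first call of the returned function, which is exactly the window in which the trap can bite.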
answered Oct 23 '22 by mrip


There is some work in progress to improve how R's higher-order functions, like the apply functions and Reduce, handle situations like these. Whether this makes it into R 3.2.0, to be released in a few weeks, depends on how disruptive the changes turn out to be. It should become clear in a week or so.
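For the record, a change along these lines did ship: since R 3.2.0, higher-order functions such as the apply functions and Reduce force the arguments of the functions they apply (per the R 3.2.0 NEWS entry; worth verifying against the NEWS file for your version). On such versions the example from the question behaves intuitively even without an explicit force:

```r
# On R >= 3.2.0, lapply() forces FUN's arguments, so each closure
# captures its own value of i:
funs <- lapply(1:10, function(i) function() print(i))
funs[[1]]()
# [1] 1
funs[[2]]()
# [1] 2
```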

answered Oct 23 '22 by Luke Tierney