
Weird mapply behaviour: what have I missed?

Tags:

r

The following code does not work as I expected:

a <- list(0, 1)
b <- list(0, 1)

# return a linear function with slope `a` and intercept `b`.
f <- function(a, b) function(x) a*x + b

# create a list of functions with different parameters.
fs <- mapply(f, a, b)

# test
fs[[1]](3)
# [1] 4  # expected zero!
fs[[2]](3)
# [1] 4

Can anyone tell me why?

NB: I've found a workaround, so I'm not looking for a different way to achieve the desired result. But I'm curious as to why this particular approach didn't work.


Update:

As of R 3.2.0, this now works as expected:

a <- list(0, 1)
b <- list(0, 1)
f <- function(a, b) function(x) a*x + b
fs <- mapply(f, a, b)

# test
fs[[1]](3)
# [1] 0 
fs[[2]](3)
# [1] 4
pete asked Dec 09 '11 03:12


2 Answers

This is the result of lazy evaluation: all arguments are passed down the call tree as promises, which avoids unnecessary computation, and they stay in this suspended state until R actually needs their values.

In your code, every function is populated with the same promise for a and the same promise for b; when those promises are finally evaluated, they all commit to the last pair of values. As @Tommy already showed, the solution is to force evaluation by "using" the values before the inner function gets defined.
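
To see a promise being resolved late, independently of mapply, here is a minimal sketch. The helper name make_getter and the variable n are made up for illustration; the point is only that the argument is evaluated at first use, not at call time, and that the result is cached afterwards.

```r
# Hypothetical helper: returns a closure without ever touching its argument.
make_getter <- function(value) function() value

n <- 1
get_n <- make_getter(n)  # `value` is still an unevaluated promise for `n`
n <- 2                   # change `n` before the promise is forced
first <- get_n()         # the promise is forced here, so it sees n == 2
n <- 3
second <- get_n()        # already evaluated: the cached value is reused
print(c(first, second))  # 2 2
```

The closure never sees the value 1, because nothing asked for it while n was still 1.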

mbq answered Nov 06 '22 07:11


[Update] My initial analysis was correct but the conclusions were wrong :) Let's get to the conclusions after the analysis.

Here's some code demonstrating the effects:

x <- lapply(1:3, function(x) sys.frame(sys.nframe()))
x[[1]] # An environment
x[[2]] # Another environment
x[[3]] # Yet another environment
x[[1]]$x  # 3!!! (should be 1)
x[[2]]$x  # 3!!  (should be 2)
x[[3]]$x  # 3 as expected

# Accessing the variable within the function will "fix" the weird behavior:
x <- lapply(1:3, function(x) {x; sys.frame(sys.nframe())})
x[[1]]$x  # 1
x[[2]]$x  # 2
x[[3]]$x  # 3

So the work-around in your case:

f <- function(a, b) { a;b; function(x) a*x + b }

Btw, as @James notes, there is a force function that makes accessing a variable more explicit:

f <- function(a, b) { force(a);force(b); function(x) a*x + b }
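
Putting the forced version back into the original example might look like the sketch below. The name make_line is made up here to avoid clashing with f, and SIMPLIFY = FALSE is an addition so that mapply explicitly returns a list; the rest follows the question.

```r
a <- list(0, 1)
b <- list(0, 1)

# Same closure factory as in the question, but both arguments are
# evaluated before the inner function is created.
make_line <- function(a, b) {
  force(a)
  force(b)
  function(x) a * x + b
}

# SIMPLIFY = FALSE keeps the result as a plain list of functions.
fs <- mapply(make_line, a, b, SIMPLIFY = FALSE)

fs[[1]](3)  # 0
fs[[2]](3)  # 4
```

Because the promises are resolved inside make_line, the result no longer depends on when the closures are first called.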

Conclusions

Well, as @mbq and @hadley noted, this is due to lazy evaluation. It's easier to show with a simple for-loop:

fs <- list(); for(i in 1:2) fs[[i]] <- f(a[[i]], b[[i]])

The function f's a argument will not get the value of a[[i]] (which is 0), but the whole expression a[[i]] together with the environment where a and i exist. When a is first accessed inside the returned function, the expression gets evaluated and therefore uses the i at the time of evaluation. If the for-loop has moved on since the call to f, you get the "wrong" result...
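
A small sketch of that timing effect, using a plain for-loop. The names lazy_f, late and early are illustrative only. The closures in early are called once inside the loop, which forces their promises while i still has the matching value; the closures in late are not touched until the loop has finished.

```r
a <- list(0, 1)
b <- list(0, 1)

# Unforced closure factory, as in the question.
lazy_f <- function(a, b) function(x) a * x + b

late <- list()
early <- list()
for (i in 1:2) {
  late[[i]] <- lazy_f(a[[i]], b[[i]])   # promises left unevaluated
  early[[i]] <- lazy_f(a[[i]], b[[i]])
  early[[i]](0)                         # any call forces a and b right now
}

late_result <- late[[1]](3)    # forced after the loop, when i == 2: gives 4
early_result <- early[[1]](3)  # forced while i == 1: gives 0
```

Both lists are built from the same calls; only the moment at which the promises are first evaluated differs.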

Initially I said that this was due to a bug in *apply, which it isn't. But since I hate to be wrong, I can point out that *apply DOES have a bug (or perhaps more of an inconsistency) in these cases:

lapply(11:12, function(x) sys.call())
#[[1]]
#FUN(11:12[[1L]], ...)
#
#[[2]]
#FUN(11:12[[2L]], ...)

lapply(11:12, function(x) function() x)[[1]]() # 12
lapply(11:12, function(x) function() x)[[2]]() # 12

As you see above, the lapply code says it calls the function with 11:12[[1L]]. If you evaluate that "later" you should still get the value 11 - but you actually get 12!

This is probably due to the fact that lapply is implemented in C code for performance reasons and cheats a bit, so the expression that it shows is not the expression that gets evaluated - ergo, a bug...

QED

Tommy answered Nov 06 '22 07:11