Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

can lapply not modify variables in a higher scope

Tags:

function

r

lapply

I often want to do essentially the following:

mat <- matrix(0,nrow=10,ncol=1)
lapply(1:10, function(i) { mat[i,] <- rnorm(1,mean=i)})

But, I would expect that mat would have 10 random numbers in it, but rather it has 0. (I am not worried about the rnorm part. Clearly there is a right way to do that. I am worry about affecting mat from within an anonymous function of lapply) Can I not affect matrix mat from inside lapply? Why not? Is there a scoping rule of R that is blocking this?

like image 413
stevejb Avatar asked Apr 17 '10 00:04

stevejb


People also ask

Can R function access global variable?

As the name suggests, Global Variables can be accessed from any part of the program.

What is the scope of a variable in JavaScript?

JavaScript variables have only two scopes. Global Variables − A global variable has global scope which means it can be defined anywhere in your JavaScript code. Local Variables − A local variable will be visible only within a function where it is defined. Function parameters are always local to that function.


1 Answers

I discussed this issue in this related question: "Is R’s apply family more than syntactic sugar". You will notice that if you look at the function signature for for and apply, they have one critical difference: a for loop evaluates an expression, while an apply loop evaluates a function.

If you want to alter things outside the scope of an apply function, then you need to use <<- or assign. Or more to the point, use something like a for loop instead. But you really need to be careful when working with things outside of a function because it can result in unexpected behavior.

In my opinion, one of the primary reasons to use an apply function is explicitly because it doesn't alter things outside of it. This is a core concept in functional programming, wherein functions avoid having side effects. This is also a reason why the apply family of functions can be used in parallel processing (and similar functions exist in the various parallel packages such as snow).

Lastly, the right way to run your code example is to also pass in the parameters to your function like so, and assigning back the output:

mat <- matrix(0,nrow=10,ncol=1)
mat <- matrix(lapply(1:10, function(i, mat) { mat[i,] <- rnorm(1,mean=i)}, mat=mat))

It is always best to be explicit about a parameter when possible (hence the mat=mat) rather than inferring it.

like image 181
Shane Avatar answered Sep 26 '22 02:09

Shane