I'm trying to understand just how lazy evaluation in R works. Does it only apply to the evaluation of function arguments? Because that I understand, e.g.
f <- function(x = x, y = x*2) {
c(x, y)
}
f(2)
[1] 2 4
But in other languages, e.g. Haskell, lazy evaluation means that a function call only gets evaluated if it's ever actually used. So I would expect something like this to run in an instant:
g <- function(x) {
y <- sample(1:100000000)
return(x)
}
g(4)
But it clearly evaluates the sample call even though its result doesn't get used.
Could somebody explain exactly how this works, or point me in the direction of where it is explained in detail?
Similar questions:
Question with similar wording, but different problem
In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed (non-strict evaluation) and which also avoids repeated evaluations (sharing).
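For R function arguments, both halves of that definition (delayed evaluation and sharing) can be observed with a small counter. This is only a sketch: the names `counter`, `expensive`, `ignore_arg` and `use_twice` are made up for illustration.

```r
# Counter that records how often the argument expression is really evaluated.
counter <- new.env()
counter$n <- 0

expensive <- function() {
  counter$n <- counter$n + 1
  42
}

# Never touches x, so the promise for x is never forced.
ignore_arg <- function(x) "done"

# Reads x twice, but the promise is evaluated only once (sharing).
use_twice <- function(x) x + x

ignore_arg(expensive())
after_ignore <- counter$n   # expensive() has not run

result <- use_twice(expensive())
after_use <- counter$n      # expensive() has run exactly once
```

The argument expression is wrapped in a promise at call time, which is why `ignore_arg()` never pays for it.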
Lazy evaluation is not always better. Its performance benefits can be great, but most unnecessary evaluation is easy to avoid in an eager language too. Laziness makes that avoidance automatic and complete, but unnecessary evaluation is rarely a major problem in practice.
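One common way to avoid unnecessary work in eagerly evaluated code is to wrap the costly expression in a zero-argument function (a thunk) and call it only where the value is needed. The sketch below applies that idea to a function shaped like the one in the question; `make_shuffle`, `g_thunk` and the `need_shuffle` flag are hypothetical names, and a small vector stands in for the large sample.

```r
# Counter that records how often the costly work actually runs.
counter <- new.env()
counter$n <- 0

# Hypothetical helper: returns a thunk that does the work when called.
make_shuffle <- function() {
  function() {
    counter$n <- counter$n + 1
    sample(1:10)
  }
}

g_thunk <- function(x, need_shuffle = FALSE) {
  shuffle <- make_shuffle()      # cheap: nothing is sampled yet
  if (need_shuffle) {
    return(length(shuffle()))    # the work happens only on this branch
  }
  x
}

r_skip <- g_thunk(4)
n_skip <- counter$n              # the thunk was never called

r_run <- g_thunk(4, need_shuffle = TRUE)
n_run <- counter$n               # the thunk was called once
```

Unlike a promise, a plain thunk does not cache its result, so calling it twice repeats the work.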
Lazy evaluation is one of the functional programming techniques for writing efficient code, and many functional programming languages support it in some form.
Languages that support lazy evaluation are usually functional programming languages like Haskell, which is lazy by default. Some languages, like OCaml and Scheme, let you opt into lazy behavior. Other languages like Swift and Perl 6 support it only for lists.
As you have already found out, R does not use lazy evaluation in the general sense. But R does provide that functionality, if you need it, through the function delayedAssign(), as shown below:
> system.time(y <- sample(1E8))
user system elapsed
7.636 0.128 7.766
> system.time(length(y))
user system elapsed
0 0 0
> system.time(delayedAssign("x", sample(1E8)))
user system elapsed
0.000 0.000 0.001
> system.time(length(x))
user system elapsed
7.680 0.096 7.777
As you can see, y is evaluated immediately, so determining the length of y takes no time at all. x, on the other hand, is not evaluated when it is created: delayedAssign() only binds x to a promise to evaluate the expression, and only when we actually need the value of x, in this case to determine its length, is x evaluated.
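The same behaviour can be checked without waiting several seconds. The sketch below swaps the large sample for a small one and uses a counter (a made-up helper called `counter`) to show when the promise is forced, and that it is forced only once.

```r
# Counter that records how often the delayed expression is evaluated.
counter <- new.env()
counter$n <- 0

# Bind x to a promise; the braced expression is not run yet.
delayedAssign("x", {
  counter$n <- counter$n + 1
  sample(10)
})
before <- counter$n        # nothing evaluated so far

len <- length(x)           # first use forces the promise
after_first <- counter$n

total <- sum(x)            # second use reads the cached value
after_second <- counter$n
```

Once forced, x behaves like any ordinary variable, so later uses cost nothing extra.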
It does not matter whether the expression is placed in a function or executed in the global environment, so wrapping the expression in a function, as you did in your example, does not really add anything, which is why I left it out. But if you want to be sure, try:
> a.f <- function(z) { delayedAssign("x", sample(1E8)); return(z+1) }
> system.time(a.f(0))
user system elapsed
0 0 0
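Applied to the function from the question, this might look like the sketch below. `g_lazy` and its `use_y` flag are hypothetical, and a small sample stands in for the large one so the result can be checked with a counter rather than a timer.

```r
# Counter that records how often the delayed expression is evaluated.
counter <- new.env()
counter$n <- 0

g_lazy <- function(x, use_y = FALSE) {
  # y is bound to a promise inside the function's own environment.
  delayedAssign("y", {
    counter$n <- counter$n + 1
    sample(1:100)
  })
  if (use_y) length(y) else x
}

r1 <- g_lazy(4)                  # y is never used, so it is never sampled
n1 <- counter$n

r2 <- g_lazy(4, use_y = TRUE)    # y is used, so the promise is forced
n2 <- counter$n
```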