I am reading Hadley Wickhams's book on Github, in particular this part on lazy evaluation. There he gives an example of consequences of lazy evaluation, in the part with add/adders
functions. Let me quote that bit:
This [lazy evaluation] is important when creating closures with lapply or a loop:
add <- function(x) { function(y) x + y } adders <- lapply(1:10, add) adders[[1]](10) adders[[10]](10)
x is lazily evaluated the first time that you call one of the adder functions. At this point, the loop is complete and the final value of x is 10. Therefore all of the adder functions will add 10 on to their input, probably not what you wanted! Manually forcing evaluation fixes the problem:
add <- function(x) { force(x) function(y) x + y } adders2 <- lapply(1:10, add) adders2[[1]](10) adders2[[10]](10)
I do not seem to understand that bit, and the explanation there is minimal. Could someone please elaborate that particular example, and explain what happens there? I am specifically puzzled by the sentence "at this point, the loop is complete and the final value of x is 10". What loop? What final value, where? Must be something simple I am missing, but I just don't see it. Thanks a lot in advance.
In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed (non-strict evaluation) and which also avoids repeated evaluations (sharing).
In Spark, Lazy Evaluation means that You can apply as many TRANSFORMATIONs as you want, but Spark will not start the execution of the process until an ACTION is called. 💡 So transformations are lazy but actions are eager.
Lazy evaluation or call-by-need is a evaluation strategy where an expression isn't evaluated until its first use i.e to postpone the evaluation till its demanded.
Always using lazy evaluation also implies early optimization. This is a bad practice which often results in code which is much more complex and expensive that might otherwise be the case.
This is no longer true as of R 3.2.0!
The corresponding line in the change log reads:
Higher order functions such as the apply functions and Reduce() now force arguments to the functions they apply in order to eliminate undesirable interactions between lazy evaluation and variable capture in closures.
And indeed:
add <- function(x) { function(y) x + y } adders <- lapply(1:10, add) adders[[1]](10) # [1] 11 adders[[10]](10) # [1] 20
The goal of:
adders <- lapply(1:10, function(x) add(x) )
is to create a list of add
functions, the first adds 1 to its input, the second adds 2, etc. Lazy evaluation causes R to wait for really creating the adders functions until you really start calling the functions. The problem is that after creating the first adder function, x
is increased by the lapply
loop, ending at a value of 10. When you call the first adder function, lazy evaluation now builds the function, getting the value of x
. The problem is that the original x
is no longer equal to one, but to the value at the end of the lapply
loop, i.e. 10.
Therefore, lazy evaluation causes all adder functions to wait until after the lapply
loop has completed in really building the function. Then they build their function with the same value, i.e. 10. The solution Hadley suggests is to force x
to be evaluated directly, avoiding lazy evaluation, and getting the correct functions with the correct x
values.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With