
How does R know not to use the old 'f'?

Tags: r

Into the R console, type:

#First code snippet
x <- 0
x <- x+1
x

You'll get '1'. That makes sense: the idea is that the 'x' in 'x+1' is the current value of x, namely 0, and this is used to compute the value of x+1, namely 1, which is then shoveled into the container x. So far, so good.

Now type:

#Second code snippet
f <- function(n) {n^2}
f <- function(n) {if (n >= 1) {n*f(n-1)} else {1}}
f(5)

You'll get '120', which is 5 factorial.

I find this perplexing. Following the logic of the first code snippet, we might expect the 'f' in the expression

if (n >= 1) {n*f(n-1)} else {1}

to be interpreted as the current value of f, namely

function(n) {n^2}

Following this reasoning, the value of f(5) should be 5*(5-1)^2 = 80. But that's not what we get.

Question. What's really going on here? How does R know not to use the old 'f'?

goblin GONE asked Dec 05 '22 17:12



2 Answers

You write that we might expect the 'f' in the expression

if (n >= 1) {n*f(n-1)} else {1}

to be interpreted as the current value of f.

Yes, we might expect that. And we would be correct.

But what is the “current value of f”? Or, more precisely, what is “current”?

“Current” means the moment the function is executed, not the moment it is defined. By the time you execute f(5), f has already been redefined. So when execution enters the function and looks up what f refers to, it finds the current (= new) definition, not the old one.

In other words: the object associated with a name is looked up when the name is actually accessed. And inside a function this means that names are looked up when the function is executed, not when it’s defined.
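A minimal sketch of that rule, using two made-up functions (caller and helper are names invented for this illustration): the body of caller mentions helper before helper even exists, and each call picks up whatever helper is bound to at that moment.

```r
# Defining `caller` succeeds even though `helper` does not exist yet:
# the body is only evaluated when `caller` is called.
caller <- function() helper() + 1

helper <- function() 10
first <- caller()    # looks up `helper` now: 10 + 1

helper <- function() 100
second <- caller()   # looks up `helper` again: 100 + 1

print(c(first, second))
```

Nothing about caller itself changed between the two calls; only the binding of the name helper did.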

The same is true for all objects. Let’s say f is using a global object that’s not a function:

n = 5
f = function() n ^ 2

n = 1
f() # = 1

To understand the difference between your first and second example, consider the following case, which involves functions yet behaves like your first case (i.e. it uses the “old” value of f).

To make the example work, we need a little helper: a function that modifies other functions. In the following, twice is a function which takes a function as an argument and returns a new function. That new function calls the original on its arguments and then calls it again on the result:

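twice = function (original_function) {
    # Evaluate the argument now; otherwise lazy evaluation would look up
    # the name only on first use, after it may have been reassigned.
    force(original_function)
    function (...) {
        # Apply the original function to its own result.
        original_function(original_function(...))
    }
}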

To illustrate what twice does, let’s invoke it on an example function:

plus1 = function (n) n + 1
plus2 = twice(plus1)
plus2(3) # = 5

Neat — R allows us to handle functions like any other object!

Now let’s modify your f:

f = function(n) {n^2}
f = twice(f)
f(5) # 625

… and here we have it: in the statement f = twice(f), the second f refers to the current (= old) definition. Only after that line does f refer to the new, modified function.
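For completeness, here is one way to get the 80 you originally expected. It is only a sketch, and old_f is a name introduced here, not something from your code. Saving the first function under its own name means the later reassignment of f cannot affect it:

```r
f <- function(n) n^2
old_f <- f   # keep a second name bound to the squaring function

# This body refers to `old_f`, which is never reassigned, so f does not
# call itself and no recursion occurs.
f <- function(n) if (n >= 1) n * old_f(n - 1) else 1

result <- f(5)   # 5 * (5 - 1)^2
print(result)
```

The lookup rule is unchanged: old_f is still resolved when f runs. It simply still points at the squaring function at that time.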

Konrad Rudolph answered Dec 27 '22 10:12



Here's a simple example illustrating my comment on Konrad's excellent answer:

a <- 2
f <- function() a*b

e <- new.env()
assign("b",5,e)
environment(f) <- e

f()
# [1] 10

b <- 10

f()
# [1] 10

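So we've manually altered the environment of f so that it always looks in e first for b. (a is still found, because new.env() by default makes the calling environment, here the global environment, the parent of e.) Theoretically, one could even lock that binding (see ?lockBinding) so that any attempt to change it throws an error.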
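A rough sketch of the locking idea, rebuilding the same e and b as above. The tryCatch wrapper and the outcome variable are my additions, only there to show the error without stopping the script:

```r
e <- new.env()
assign("b", 5, envir = e)
lockBinding("b", e)   # further assignments to `b` in `e` now fail

outcome <- tryCatch(
  {
    assign("b", 10, envir = e)
    "changed"
  },
  error = function(err) "error"
)

print(outcome)
print(get("b", envir = e))
```

The assignment is rejected, so b in e keeps its value of 5.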

This sort of thing could get complicated, though, as in general you'd want to make sure that you set the parent environment of e correctly based on where the function f is actually being created. In this example f is created in the global environment, but if f were being created inside another function, you'd want e's parent environment to reflect that.
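To make that last point concrete, here is a sketch with a hypothetical factory function make_f (my naming, as are local_a and the parent_is_local switch). Passing parent = environment() ties e to the frame where f is created, so f can still see that function's local variables. With globalenv() as the parent it cannot.

```r
make_f <- function(parent_is_local) {
  local_a <- 2   # exists only inside this call of make_f
  parent <- if (parent_is_local) environment() else globalenv()

  e <- new.env(parent = parent)
  assign("b", 5, envir = e)

  f <- function() local_a * b
  environment(f) <- e
  f
}

f_local <- make_f(TRUE)
f_global <- make_f(FALSE)

# Finds b in e, then local_a in make_f's frame.
local_result <- f_local()

# Assumes no global variable called local_a exists, so the lookup fails.
global_result <- tryCatch(f_global(), error = function(err) "error")

print(local_result)
print(global_result)
```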

joran answered Dec 27 '22 11:12
