Precedence of a function call in R

The standard R help page for operator precedence does not include function calls, which seems rather sloppy in my opinion. This was causing me some problems, so I decided to use trial and error with substitute and found that the precedence seems to lie between [[ and ^:

> substitute(a^b())[[1]]
`^`
> substitute(a[b]())[[1]]
a[b]

In prefix (Lisp-style) notation, these would be (^ a (b ())) and (([ a b) ()), denoting the call operator as (). In plain English, the first example shows that the exponentiation operator is called on arguments a and b(), whereas in the second example the final result is a call to the function a[b].
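The same trial-and-error check can be repeated against a few more operators. This is only a sketch: the operators chosen here are my own selection, and pkg is a placeholder name that is never evaluated because everything is quoted.

```r
# Each expression is quoted (parsed, not evaluated). Element [[1]] of the
# outermost call is the operation that would be applied last.
exprs <- list(
  quote(a^b()),     # the call binds tighter than ^
  quote(-b()),      # ... and tighter than unary minus
  quote(a:b()),     # ... and tighter than :
  quote(a[b]()),    # [ is completed first, then the result is called
  quote(a[[b]]()),  # same for [[
  quote(a$b()),     # same for $
  quote(pkg::b())   # same for ::
)

for (e in exprs) {
  cat(format(deparse(e), width = 10), " head: ", deparse(e[[1]]), "\n", sep = "")
}
```

For the first three expressions the head is the operator itself; for the last four the head is the whole extraction expression, which is then called.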

Does this precedence hold in every case? It seems odd that the precedence of a function call wouldn't be constant but it doesn't make sense that it wouldn't be included on the above help page if it was indeed constant.

Jon Claus asked May 15 '14 21:05


2 Answers

The precedence of a function call is constant

R is very much like Lisp under the hood.

It has SEXPs like Lisp; for a call, the SEXP is a tuple (list) where the first element ([[1]]) is the operator, and the remaining elements (which are commonly themselves other SEXPs) are the arguments to the operator.

When you write

paste("a",1 + 2)

R understands

(`paste`,"a",(`+`, 1, 2))
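As a rough illustration of that nested form, the same call can be assembled by hand from lists and compared with what the parser produces. The variable names parsed and rebuilt are mine, used only for this sketch.

```r
# What the parser produces for the expression above.
parsed <- quote(paste("a", 1 + 2))

# The same structure built by hand: operator first, then the arguments.
inner   <- as.call(list(as.name("+"), 1, 2))
rebuilt <- as.call(list(as.name("paste"), "a", inner))

parsed[[1]]        # the operator of the outer call
parsed[[3]][[1]]   # the operator of the nested call
eval(rebuilt)      # evaluates like the original expression: "a 3"
```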

When you run substitute, you get the SEXPs back (although they pretty-print like R code). The first element of the outermost SEXP is the last operator that will be applied in the expression, i.e. the one with the lowest precedence.

As you probably know, you can view the parts of the expression using something like:

> str(as.list(quote(a^b())))
List of 3
 $ : symbol ^
 $ : symbol a
 $ : language b()

Now apply this understanding to precedence in your example.

What is the last operator of a^b()?

Let's consider it stepwise:

  1. substitute and evaluate a
  2. substitute and evaluate b
  3. evaluate (result of step) 2 with no arguments (this is known as a call)
  4. substitute and evaluate ^
  5. evaluate (result of step) 4 with arguments (result of step) 1 and (result of step) 3

So the last operator is the value named ^.

Next, what is the last operator of a[b]()?

  1. substitute and evaluate a
  2. substitute and evaluate b
  3. substitute and evaluate [
  4. evaluate (result of step) 3 with arguments (result of step) 1 and (result of step) 2
  5. evaluate (result of step) 4 with no arguments (a call)

In this case (result of step) 4 has the convenient name a[b].

The last operator is therefore a call (evaluation with no arguments) to a[b].
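Both walkthroughs can be replayed with concrete values. This is a sketch under assumptions of my own: the values of a and b are made up, and the class fnbox with its `[` method is hypothetical, defined only so that a[b] actually yields a function (ordinary list subsetting with [ returns a list, which cannot be called).

```r
# First example: a^b()
a <- 2
b <- function() 3
direct1   <- a^b()
step3     <- b()            # call b with no arguments
stepwise1 <- `^`(a, step3)  # then apply ^ to a and that result

# Second example: a[b]()
# Hypothetical S3 class whose `[` method returns the selected function.
a <- structure(list(function() "first", function() "second"), class = "fnbox")
`[.fnbox` <- function(x, i) unclass(x)[[i]]
b <- 2
direct2   <- a[b]()
step4     <- `[`(a, b)      # apply [ to a and b, giving a function
stepwise2 <- step4()        # then call that function with no arguments

c(direct1, stepwise1)       # both 8
c(direct2, stepwise2)       # both "second"
```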


Edit: Caveat

I have simplified the real situation here. Owing to a peculiarity of R, function arguments are passed to functions (operators) as unevaluated (environment, expression) pairs, rather than by reference or by value. So while the 'commit' order is roughly the same as above, the real dispatch order is actually the reverse, and may even miss out steps. However, you don't need to worry about that yet.
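A small sketch of that behaviour for ordinary R functions (closures). All names here (seen, note, use_y_then_x, ignore_y) are hypothetical helpers invented for the illustration.

```r
seen <- character()

# Records the order in which argument expressions are actually evaluated.
note <- function(tag, value) {
  seen <<- c(seen, tag)
  value
}

# The body touches y before x, so y's expression is evaluated first,
# regardless of the order in which the arguments were written.
use_y_then_x <- function(x, y) { y; x }
result <- use_y_then_x(note("x", 1), note("y", 2))
seen   # "y" "x"

# An argument that is never used is never evaluated at all.
ignore_y <- function(x, y) x
ignore_y(1, stop("never evaluated"))   # returns 1, no error
```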

Alex Brown answered Oct 04 '22 22:10


This may be less a "precedence" issue than a parsing issue. (After thinking about it, though, it does seem like precedence, induced by the need to complete the argument matching of everything between "[" and "]".) In the first instance the parse tree is constructed as:

            `^`
            /  \
           a    b()

> substitute(a^b())[1]
`^`()
> substitute(a^b())[[1]]
`^`
> substitute(a^b())[[2]]
a
> substitute(a^b())[[3]]
b()

In the second instance it was constructed as

             a[b]
            /
           NULL

But the first element would also have a structure:

            `[`
            / \
           a   b

> substitute(a[b]())[[1]][[1]]
`[`
> substitute(a[b]())[[1]][[2]]
a
> substitute(a[b]())[[1]][[3]]
b

I'm thinking the ambiguity may occur because, of the two functions (^ and [), only the latter could actually deliver a function, so it would need to be processed first. The result of evaluating a^b could never be a function, so it makes sense to process it as ^(a, b()).

When it comes to making something like this actually work, I don't think the second one is very useful. In order to get extraction and substitution from the workspace, you need an extra extraction step:

> b <- list(mean)
> eval( substitute(a^b(1:10) , list(a=2) ))
Error in eval(expr, envir, enclos) : could not find function "b"
> eval( substitute(a^b[[1]](1:10) , list(a=2) ))
[1] 45.25483
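To see why the second form works, it may help to look at the pieces of the substituted expression before evaluating it. A minimal sketch, reusing the same b and the same substitution of a:

```r
b <- list(mean)
e <- substitute(a^b[[1]](1:10), list(a = 2))

e[[1]]        # head of the whole expression: ^
e[[2]]        # a has been replaced by 2
e[[3]]        # right-hand operand: the call b[[1]](1:10)
e[[3]][[1]]   # the thing being called: b[[1]], i.e. mean once evaluated

eval(e)       # 2^mean(1:10), i.e. 2^5.5
```

With b(1:10) instead, R looks for a function named b, and the list stored in b does not qualify, hence the "could not find function" error.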

Following @hadley's suggestion, I copied his ast function from pryr, along with its companion function call_tree, from the draw_tree.r module in the pryr repository on GitHub. I needed to do it this way since I'm on the road and my laptop is still stuck on an out-of-date R version that doesn't have a binary of pryr. I also needed to install and load pkg:stringr to get str_c.

With that we can see the difference:

> ast(a[b]())
\- ()
 \- ()
  \- `[
  \- `a
  \- `b
> ast(a^b())
\- ()
 \- `^
 \- `a
 \- ()
  \- `b

Pretty slick @hadley.
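If pryr is not available, a similar view can be had with a few lines of base R. This is my own rough stand-in, not pryr's implementation; the name tree_lines is hypothetical, and the sketch does not handle calls with empty arguments such as x[, 1].

```r
# Returns one line per node: "()" marks a call, and its children (function
# first, then arguments) follow with two more spaces of indentation.
tree_lines <- function(x, depth = 0) {
  pad <- strrep("  ", depth)
  if (is.call(x)) {
    children <- lapply(as.list(x), tree_lines, depth = depth + 1)
    c(paste0(pad, "()"), unlist(children))
  } else if (is.symbol(x)) {
    paste0(pad, as.character(x))
  } else {
    paste0(pad, deparse(x))   # constants such as numbers or strings
  }
}

cat(tree_lines(quote(a[b]())), sep = "\n")
cat(tree_lines(quote(a^b())), sep = "\n")
```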

IRTFM answered Oct 04 '22 20:10