Where, if anywhere, are the dangers of non-standard evaluation documented?

Many of R's functions with non-standard evaluation, e.g. with, subset, and transform, contain a warning like this:

For interactive use this is very effective and nice to read. For programming however, i.e., in one's functions, more care is needed, and typically one should refrain from using with(), as, e.g., variables in data may accidentally override local variables, see the reference.
(quoted from the documentation for with; the others are less informative)
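To make the quoted warning concrete, here is a minimal sketch of a data column silently overriding a local variable inside with(). The function filter_above and the column names value and threshold are hypothetical, invented only for this illustration.

```r
# Hypothetical helper: return the entries of `value` above a local `threshold`.
filter_above <- function(data, threshold) {
  # with() looks names up in `data` first, then in this function's frame.
  with(data, value[value > threshold])
}

ok  <- data.frame(value = c(1, 5, 10))
bad <- data.frame(value = c(1, 5, 10), threshold = c(100, 0, 100))

res_ok  <- filter_above(ok,  threshold = 4)  # uses the function argument: 5 10
res_bad <- filter_above(bad, threshold = 4)  # uses the column instead: 5
```

No error or warning is raised in the second call; the result just changes because the data happens to contain a column named like the local variable.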

"The reference" is this 2003 article. Frankly, I don't see its relevance. It mentions the point about "variables in data may accidentally override local variables" in section 6, but it only does that - mention it. As far as I can see, nothing in that article tells you anything that the warning telling you to check the reference didn't already tell you.

I've searched through the R Manuals, even searching the 3500-page Reference Index for the term "non-standard", but I've come up with nothing other than what I've already mentioned. I really thought that it would be in the language definition, but I've read the whole thing and didn't find it. The closest that I got was the section that covers the substitute function, which I happen to know many functions with non-standard evaluation rely on.

As for other places where I'm confident that help cannot be found, I've read both the R FAQ and An Introduction to R from cover to cover. The R FAQ mentions eval and substitute a handful of times, but not in any way that is relevant here. The only notable part was here, which also suggests checking the documentation for deriv, but I found nothing useful there.

So, is there any official part of R where the dangers of non-standard evaluation are actually documented? I find it very strange that parts of R's documentation would tell me to take care with something, without providing any place where I'm told how to do that. It's undeniable that care is needed. For example, Advanced R shows several ways that functions with non-standard evaluation can cause problems. I have paid for such carelessness before and it's not hard to find excellent answers with comments full of warnings about non-standard evaluation.

J. Mini asked Apr 01 '21 15:04



2 Answers

(Posting as an answer, because this is a bit too long for a comment.)

I don't know of a specific place where the dangers are documented, but from my personal experience, there are two important caveats to keep in mind when working with NSE:

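  1. substitute() does not work as expected in nested functions: it only sees the expression passed by the immediate caller, so a wrapper that forwards its argument hands over its own argument name rather than the expression the user wrote. This leads to problems when trying to do sophisticated things with functions that use substitute(). Examples include glm() and coxph().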

  2. If using rlang, the operator !! results in immediate evaluation of its operand w.r.t. the expression as a whole. This can lead to obscure "variable not found" errors, if the expression contains variables that will be defined when other parts of the expression are evaluated.
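The first caveat can be sketched in base R as follows. The functions label_of, wrapper, and wrapper_fixed are hypothetical names used only for this illustration, and the workaround shown is one common idiom, not the only one.

```r
# Hypothetical NSE function: returns the text of the expression it was given.
label_of <- function(x) deparse(substitute(x))

# Hypothetical wrapper that forwards its argument unchanged.
wrapper <- function(y) label_of(y)

direct <- label_of(a + b)  # "a + b": sees the user's expression
nested <- wrapper(a + b)   # "y": sees only the wrapper's argument name

# One workaround: splice the captured expression into the inner call
# before evaluating it.
wrapper_fixed <- function(y) eval(substitute(label_of(y)))
fixed <- wrapper_fixed(a + b)  # "a + b"
```

Neither a nor b has to exist here, because the arguments are only captured, never evaluated. With a real modelling function, the workaround also has to take care about which environment the rebuilt call is evaluated in.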

Outside of those two caveats, I generally find NSE to be very robust. This is especially true if you are using rlang, which goes a long way towards standardizing NSE functionality. With that said, my personal advice is to use NSE only when necessary and stick to standard evaluation (SE) as much as possible. While NSE can be extremely powerful, it produces code that can be hard to read, understand and maintain.

Artem Sokolov answered Oct 27 '22 13:10


I guess section 6.3 "More on Evaluation" of the R language definition says a little about the whole problem.

Another case that occurs frequently is evaluation in a list or a data frame. For instance, this happens in connection with the model.frame function when a data argument is given. Generally, the terms of the model formula need to be evaluated in data, but they may occasionally also contain references to items in the caller of model.frame. This is sometimes useful in connection with simulation studies. So for this purpose one needs not only to evaluate an expression in a list, but also to specify an enclosure into which the search continues if the variable is not in the list. Hence, the call has the form eval(expr, data, sys.frame(sys.parent())).
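A small sketch of the pattern described in that passage: evaluate an expression in a data frame, and fall back to the caller for names the data does not contain. The helper eval_in_data and the function simulate are hypothetical, and the sketch uses parent.frame() as the enclosure rather than the sys.frame(sys.parent()) form from the quoted text.

```r
# Hypothetical helper mirroring the eval(expr, data, enclos) pattern.
eval_in_data <- function(expr, data) {
  eval(substitute(expr), data, enclos = parent.frame())
}

simulate <- function() {
  offset <- 10                      # lives in the caller, not in the data
  df <- data.frame(x = c(1, 2, 3))
  eval_in_data(x + offset, df)      # x comes from df, offset from simulate()
}

result <- simulate()                # 11 12 13
```

The same lookup order is what makes the override problem possible: if df also had a column called offset, that column would be used instead of the local variable.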

And then the specific part where the text seems to "warn" the reader:

Notice that evaluation in a given environment may actually change that environment, most obviously in cases involving the assignment operator, such as eval(quote(total <- 0), environment(robert$balance)) # rob Rob. This is also true when evaluating in lists, but the original list does not change because one is really working on a copy.
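The difference between the two cases in that quote can be checked directly. This is a minimal sketch with made-up names (env, lst, total), not the robert$balance example from the language definition.

```r
# Evaluating an assignment in an environment changes that environment.
env <- new.env()
assign("total", 100, envir = env)
eval(quote(total <- 0), env)
env_total <- get("total", envir = env)  # 0

# Evaluating the same assignment in a list works on a temporary copy,
# so the original list is untouched.
lst <- list(total = 100)
eval(quote(total <- 0), lst)
list_total <- lst$total                 # 100
```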

Maybe it should be improved, because it doesn't address non-standard evaluation directly.

eduardokapp answered Oct 27 '22 13:10