
What's the difference between substitute and quote in R

Tags:

r

The official docs say:

substitute returns the parse tree for the (unevaluated) expression expr, substituting any variables bound in env.

quote simply returns its argument. The argument is not evaluated and can be any R expression.

But when I try:

> x <- 1
> substitute(x)
x
> quote(x)
x

It looks like both quote and substitute return the expression that's passed to them as an argument.

So my question is: what's the difference between substitute and quote, and what does "substituting any variables bound in env" mean?

Lifu Huang asked Oct 19 '17 16:10



3 Answers

Here's an example that may help you to easily see the difference between quote() and substitute(), in one of the settings (processing function arguments) where substitute() is most commonly used:

f <- function(argX) {
   list(quote(argX), 
        substitute(argX), 
        argX)
}
    
suppliedArgX <- 100
f(argX = suppliedArgX)
# [[1]]
# argX
# 
# [[2]]
# suppliedArgX
# 
# [[3]]
# [1] 100
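
As a small extension of the example above, the same capture works when the caller passes a whole expression rather than a single variable. This is only a sketch: the function name g and the fields it returns are made up for illustration.

```r
# Hypothetical helper: report what the caller wrote, what kind of
# object substitute() captured, and what the argument evaluates to.
g <- function(arg) {
  captured <- substitute(arg)      # the caller's unevaluated expression
  list(code  = deparse(captured),  # the expression as text
       class = class(captured),    # "name" for a symbol, "call" for a call
       value = arg)                # forcing the argument evaluates it
}

suppliedArgX <- 100
res_symbol <- g(suppliedArgX)          # a bare symbol
res_call   <- g(suppliedArgX * 2 + 1)  # a compound expression
```

Inside g, quote(arg) would always give the symbol arg, regardless of what the caller typed.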
Josh O'Brien answered Oct 18 '22 21:10


R evaluates function arguments lazily, so what a variable name refers to is a little less clear than in other languages. Libraries like dplyr take advantage of this, letting you write, for instance:

summarise(mtcars, total_cyl = sum(cyl))

We can ask what each of these tokens means: summarise and sum are defined functions, mtcars is a defined data frame, and total_cyl is a named argument to the function summarise. But what is cyl?

> cyl
Error: object 'cyl' not found

It isn't anything! Well, not yet. R doesn't evaluate it right away, but keeps it as an unevaluated expression to be evaluated later in an environment other than the global environment your command line is working in, specifically one where the columns of mtcars are defined. Somewhere in the guts of dplyr, something conceptually like this is happening:

> substitute(cyl, mtcars)
[1] 6 6 4 6 8 ...

Suddenly cyl means something. That's what substitute is for.
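
To be clear, this is not dplyr's actual implementation, which uses its own evaluation machinery. A rough base-R sketch of the same idea is to capture the expression with substitute and then evaluate it with the data frame supplying the variables. The helper name with_columns is hypothetical.

```r
# Hypothetical helper, assuming `data` is a data frame (or list).
with_columns <- function(data, expr) {
  captured <- substitute(expr)          # e.g. the call sum(cyl), not its value
  # Look names up in the columns of `data` first, then in the caller's scope.
  eval(captured, data, parent.frame())
}

total_cyl <- with_columns(mtcars, sum(cyl))
```

Here cyl is never looked up in the global environment; it is resolved against the columns of mtcars when eval runs.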

So what is quote for? Well, sometimes you want your lazily evaluated expression to be represented somewhere else before it's evaluated, i.e. you want to display the actual code you're writing without any (or with only some) values substituted. The help page for substitute mentions a typical use of such unevaluated expressions: creating "informative labels for data sets and plots".

So, for example, you could create a quoted expression, then both print the unevaluated expression in your chart to show how a value was calculated and actually calculate with the same expression.

expr <- quote(x + y)
print(expr) # x + y
eval(expr, list(x = 1, y = 2)) # 3

Note that substitute can do this expression trick too, while giving you the option to fill in values for only part of the expression. In that sense its features are roughly a superset of quote's.

expr <- substitute(x + y, list(x = 1))
print(expr) # 1 + y
eval(expr, list(y = 2)) # 3
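
A quick way to check these claims, as a sketch: both functions return ordinary language objects, and a partially substituted expression is the same object you would get by quoting the filled-in code directly.

```r
quoted  <- quote(x + y)
partial <- substitute(x + y, list(x = 1))

# Both results are unevaluated calls.
classes <- c(class(quoted), class(partial))

# The partially substituted call equals the hand-written quoted one.
same_as_quote <- identical(partial, quote(1 + y))

# With an empty environment there is nothing to substitute,
# so substitute() behaves like quote().
untouched <- identical(substitute(x + y, new.env()), quoted)
```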
Chris answered Oct 18 '22 21:10


Maybe this section of the documentation will help somewhat:

Substitution takes place by examining each component of the parse tree as follows: If it is not a bound symbol in env, it is unchanged. If it is a promise object, i.e., a formal argument to a function or explicitly created using delayedAssign(), the expression slot of the promise replaces the symbol. If it is an ordinary variable, its value is substituted, unless env is .GlobalEnv in which case the symbol is left unchanged.

Note the final bit, and consider this example:

> e <- new.env()
> assign(x = "a", value = 1, envir = e)
> substitute(a, env = e)
[1] 1

Compare that with:

> quote(a)
a

So there are two basic situations in which substitution will occur: when we're using it on an argument of a function, and when env is some environment other than .GlobalEnv. That's why your particular example was confusing: you called substitute at the top level, where env is .GlobalEnv.
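
The rules from the documentation excerpt can be tried one at a time. This is a minimal sketch; the variable names a, b and p are arbitrary choices for the demonstration.

```r
e <- new.env()
assign("a", 1, envir = e)                              # an ordinary variable in e
delayedAssign("p", stop("never run"), assign.env = e)  # a promise in e

r_unbound <- substitute(b, e)      # b is not bound in e: left as the symbol b
r_value   <- substitute(a + b, e)  # a is replaced by its value: 1 + b
r_promise <- substitute(p, e)      # the promise's expression, not forced
r_global  <- substitute(a, globalenv())  # .GlobalEnv: symbol left unchanged
```

Note that substitute(p, e) returns the call stop("never run") without running it, because only the expression slot of the promise is used.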

For another comparison with quote, consider modifying the myplot function from the Examples section of the substitute help page to be:

myplot <- function(x, y)
    plot(x, y, xlab = deparse(quote(x)),
             ylab = deparse(quote(y)))

and you'll see that quote really doesn't do any substitution: the axis labels are always the literal strings "x" and "y", whatever you pass in.
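
The same comparison can be made without opening a plot window. In this sketch the two helper functions are hypothetical stand-ins for myplot that return the labels instead of drawing them.

```r
# Labels built with quote(): always the formal argument names.
labels_quote <- function(x, y) {
  c(deparse(quote(x)), deparse(quote(y)))
}

# Labels built with substitute(): the expressions the caller wrote.
labels_subst <- function(x, y) {
  c(deparse(substitute(x)), deparse(substitute(y)))
}

# theta does not need to exist: the arguments are never evaluated.
q <- labels_quote(sin(theta), cos(theta))
s <- labels_subst(sin(theta), cos(theta))
```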

joran answered Oct 18 '22 21:10