
Using do.call with dplyr standard evaluation version

Tags: r, dplyr

How can I get a do.call with a variable list of arguments and functions to work with the standard evaluation version of summarise_ in dplyr?

## Some sample data, function, and variables to interpolate
set.seed(0)
dat <- data.frame(a=runif(10), b=runif(10))
fn <- function(x, y) IQR(x / y, na.rm = TRUE)
funs <- list(fn="fn")
targs <- list("a", "b")

This is the lazyeval::interp call I'm trying to make work:

library(dplyr)
library(lazyeval)  # provides interp()
interp(~do.call(fn, xs), .values=list(fn=funs$fn, xs=targs))
# ~do.call("fn", list("a", "b"))

but it doesn't work:

dat %>%
  summarise_(out = interp(~do.call(fn, xs), .values=list(fn=funs$fn, xs=targs)))

Expected result

dat %>%
  summarise(out = do.call(fn, list(a, b)))
#        out
# 1 1.084402

If I add in some print statements, I know the problem is that the "a" and "b" aren't being interpreted properly, but I haven't been able to figure out how to quote them properly.

fn <- function(x, y) { print(x); print(y); IQR(x / y, na.rm = TRUE) }
dat %>%
  summarise_(out = interp(~do.call(fn, xs), fn=funs$fn, xs=targs))
# [1] "a"
# [1] "b"
# Error: non-numeric argument to binary operator
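
The same failure can be reproduced without dplyr, which may help isolate it. The following is a minimal base-R sketch, assuming that evaluating a quoted expression against the data frame with eval() is a fair stand-in for what summarise_ does; the names res_strings and res_symbols are made up for illustration.

```r
# Base-R sketch: strings vs. symbols as do.call arguments
set.seed(0)
dat <- data.frame(a = runif(10), b = runif(10))
fn <- function(x, y) IQR(x / y, na.rm = TRUE)

# With strings, fn receives the literal values "a" and "b",
# so x / y fails; the error message is captured instead of thrown.
res_strings <- tryCatch(
  eval(quote(do.call("fn", list("a", "b"))), dat),
  error = function(e) conditionMessage(e)
)

# With symbols, a and b are looked up as columns of dat.
res_symbols <- eval(quote(do.call("fn", list(a, b))), dat)

print(res_strings)
print(res_symbols)
```

Here the only difference between the two expressions is whether the list holds character strings or unquoted names.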

Rorschach asked Nov 14 '15 22:11



1 Answer

The arguments substituted for xs need to be a call object (class "call"), and the variables inside that call (a and b) need to be names (class "name") rather than character strings. The second line below builds this; the third line shows the hardcoded equivalent. See ?call, ?as.name, and ?is.language for background.

dat <- data.frame(a=runif(10), b=runif(10), grp=rep(1:2, each=5))
targs_quoted = do.call(call, c("list", lapply(targs, as.name)), quote=TRUE)
# In hardcoded form, targs_quoted = quote(list(a, b))
dat %>%
  group_by(grp) %>%
  summarise_(out = interp(~do.call(fn, xs), 
                          .values=list(fn=funs$fn, xs=targs_quoted)))

# Source: local data frame [2 x 2]
#     
#       grp       out
#     (int)     (dbl)
#  1     1  1.0754497
#  2     2  0.9892201
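
To see what the conversion produces, here is a small base-R sketch that builds the same kind of quoted call and evaluates it against a data frame directly. It uses as.call() instead of do.call(call, ...), which is an alternative construction and not the line from the answer above; the toy data and the names targs_call and out are assumptions made for illustration.

```r
# Base-R sketch: turn a list of column names into the call list(a, b)
targs <- list("a", "b")
dat <- data.frame(a = c(1, 2, 3, 4), b = c(2, 2, 2, 2))  # toy data
fn <- function(x, y) IQR(x / y, na.rm = TRUE)

# Each string becomes a name (symbol); the names are wrapped in a call to list()
targs_call <- as.call(c(list(as.name("list")), lapply(targs, as.name)))

print(deparse(targs_call))      # text form of the constructed call
print(is.call(targs_call))      # the whole object is a call
print(is.name(targs_call[[2]])) # its arguments are names, not strings

# Build do.call("fn", list(a, b)) and evaluate it with dat's columns in scope
out <- eval(call("do.call", "fn", targs_call), dat)
print(out)
```

Because the arguments are names, they are resolved against the data at evaluation time, which is what the interp() call relies on.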

dplyr's "nse" (non-standard evaluation) vignette was very helpful here. I found that . always refers to the entire table, not the current group, which is why some of the suggestions in the comments didn't work as intended.

kdauria answered Oct 26 '22 15:10