The package data.table has some special syntax that requires one to use expressions as the i and j arguments.
This has some implications for how one writes functions that accept and pass arguments to data tables, as is explained really well in section 1.16 of the FAQs.
But I can't figure out how to take this one additional level.
Here is an example. Say I want to write a wrapper function foo() that makes a specific summary of my data, and then a second wrapper plotfoo() that calls foo() and plots the result:
library(data.table)
foo <- function(data, by){
by <- substitute(by)
data[, .N, by=list(eval(by))]
}
DT <- data.table(mtcars)
foo(DT, gear)
OK, this works, because I get my tabulated results:
by N
1: 4 12
2: 3 15
3: 5 5
Now, I try to do just the same when writing plotfoo(), but I fail miserably:
plotfoo <- function(data, by){
by <- substitute(by)
foo(data, eval(by))
}
plotfoo(DT, gear)
But this time I get an error message:
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
OK, so the eval() is causing a problem. Let's remove it:
plotfoo <- function(data, by){
by <- substitute(by)
foo(data, by)
}
plotfoo(DT, gear)
Oh no, I get a new error message:
Error in `[.data.table`(data, , .N, by = list(eval(by))) :
column or expression 1 of 'by' or 'keyby' is type symbol. Do not quote column names. Useage: DT[,sum(colC),by=list(colA,month(colB))]
And here is where I remain stuck.
Question: How to write a function that calls a function that calls data.table?
This will work:
plotfoo <- function(data, by) {
by <- substitute(by)
do.call(foo, list(quote(data), by))
}
plotfoo(DT, gear)
# by N
# 1: 4 12
# 2: 3 15
# 3: 5 5
Explanation:
The problem is that your call to foo() in plotfoo() looks like one of the following:
foo(data, eval(by))
foo(data, by)
When foo processes those calls, it dutifully substitutes for the second formal argument (by), getting as by's value the unevaluated expression eval(by) or by. But you want by's value to be gear, as in the call foo(data, gear).
do.call() solves this problem by evaluating the elements of its second argument before constructing the call that it then evaluates. As a result, when you pass it by, it evaluates it to its value (the symbol gear) before constructing a call that looks (essentially) like this:
foo(data, gear)
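To see the difference without data.table in the picture, here is a minimal base R sketch. The function names inner, outer_direct and outer_docall are made up for illustration; inner() stands in for foo() and simply returns whatever substitute() captured.

```r
# Hypothetical stand-in for foo(): return the expression captured for x
inner <- function(x) substitute(x)

# Passing the argument straight through: inner() only ever sees the symbol x
outer_direct <- function(x) inner(x)

# Capturing first, then building the call with do.call():
# list(x) evaluates x to its value (the symbol gear) before the call is made
outer_docall <- function(x) {
  x <- substitute(x)
  do.call(inner, list(x))
}

deparse(inner(gear))         # "gear"
deparse(outer_direct(gear))  # "x"
deparse(outer_docall(gear))  # "gear"
```

Only the do.call() version hands the original symbol through to the inner function, which is what foo() needs in order to find the gear column.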
I think you might be tying yourself up in knots. This works:
library(data.table)
foo <- function(data, by){
by <- by
data[, .N, by=by]
}
DT <- data.table(mtcars)
foo(DT, 'gear')
plotfoo <- function(data, by){
foo(data, by)
}
plotfoo(DT, 'gear')
And that method supports passing in character values:
> gg <- 'gear'
> plotfoo <- function(data, by){
+ foo(data, by)
+ }
> plotfoo(DT, gg)
gear N
1: 4 12
2: 3 15
3: 5 5
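As a further sketch along the same lines (the two-column grouping is my own illustration, not part of the question): since by accepts a character vector of column names, the same pair of wrappers should also handle grouping by more than one column with no extra work.

```r
library(data.table)

# Same string-based wrappers as above, minus the redundant assignment
foo <- function(data, by) {
  data[, .N, by = by]
}
plotfoo <- function(data, by) {
  foo(data, by)
}

DT <- data.table(mtcars)

# Group by two columns by passing a character vector of column names
res <- plotfoo(DT, c("gear", "cyl"))
print(res)
```

The trade-off is that callers must quote column names, but nothing has to be captured or re-evaluated at any level of wrapping.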