How to write a function that calls a function that calls data.table?

Tags: r, data.table

The package data.table has some special syntax that requires one to use expressions as the i and j arguments.

This has some implications for how one writes functions that accept and pass arguments to data tables, as is explained really well in section 1.16 of the FAQs.

But I can't figure out how to take this one additional level.

Here is an example. Say I want to write a wrapper function foo() that makes a specific summary of my data, and then a second wrapper plotfoo() that calls foo() and plots the result:

library(data.table)


foo <- function(data, by){
  by <- substitute(by)
  data[, .N, by=list(eval(by))]
}

DT <- data.table(mtcars)
foo(DT, gear)

OK, this works, because I get my tabulated results:

   by  N
1:  4 12
2:  3 15
3:  5  5

Now, I try to do just the same when writing plotfoo(), but I fail miserably:

plotfoo <- function(data, by){
  by <- substitute(by)
  foo(data, eval(by))
}
plotfoo(DT, gear)

But this time I get an error message:

Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

OK, so the eval() is causing a problem. Let's remove it:

plotfoo <- function(data, by){
  by <- substitute(by)
  foo(data, by)
}
plotfoo(DT, gear)

Oh no, I get a new error message:

Error in `[.data.table`(data, , .N, by = list(eval(by))) : 
  column or expression 1 of 'by' or 'keyby' is type symbol. Do not quote column names. Useage: DT[,sum(colC),by=list(colA,month(colB))]

And here is where I remain stuck.

Question: How to write a function that calls a function that calls data.table?

asked Feb 12 '13 17:02 by Andrie



2 Answers

This will work:

plotfoo <- function(data, by) {
  by <- substitute(by)
  do.call(foo, list(quote(data), by))
}

plotfoo(DT, gear)
#    by  N
# 1:  4 12
# 2:  3 15
# 3:  5  5

Explanation:

The problem is that your call to foo() in plotfoo() looks like one of the following:

foo(data, eval(by))
foo(data, by)

When foo() processes those calls, it dutifully calls substitute() on its second formal argument (by) and gets, as by's value, the call eval(by) or the symbol by, respectively. But you want by's value to be the symbol gear, as in the call foo(data, gear).

do.call() solves this problem by evaluating the elements of its second argument before constructing the call that it then evaluates. As a result, when you pass it by, it evaluates it to its value (the symbol gear) before constructing a call that looks (essentially) like this:

foo(data, gear)
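
To see the difference concretely, here is a minimal base-R sketch. The names show_by(), wrapper_direct() and wrapper_docall() are hypothetical and introduced only for illustration; show_by() stands in for foo() and simply returns whatever substitute(by) captures, so no data.table is needed.

```r
# Hypothetical stand-in for foo(): return the unevaluated expression
# that was supplied for `by`, i.e. what substitute() sees.
show_by <- function(data, by) substitute(by)

# Direct call: the inner substitute() captures the symbol `by`,
# not the symbol the user originally typed.
wrapper_direct <- function(data, by) {
  by <- substitute(by)
  show_by(data, by)
}

# do.call(): `by` is evaluated to the symbol it holds while the argument
# list is built, so the constructed call is show_by(data, gear).
wrapper_docall <- function(data, by) {
  by <- substitute(by)
  do.call(show_by, list(quote(data), by))
}

direct_result <- wrapper_direct(NULL, gear)   # the symbol `by`
docall_result <- wrapper_docall(NULL, gear)   # the symbol `gear`
print(direct_result)
print(docall_result)
```

The quote(data) in the argument list keeps the constructed call small: it passes the symbol data, which is looked up in the wrapper's frame only when needed, instead of embedding the whole table in the call.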
answered Oct 22 '22 11:10 by Josh O'Brien


I think you might be tying yourself up in knots. This works:

library(data.table)
foo <- function(data, by){
  by <- by
  data[, .N, by=by]
}

DT <- data.table(mtcars)
foo(DT, 'gear')

plotfoo <- function(data, by){
  foo(data, by)
}
plotfoo(DT, 'gear')

And that method also supports passing the column name in a character variable:

> gg <- 'gear'
> plotfoo <- function(data, by){
+   foo(data, by)
+ }
> plotfoo(DT, gg)
   gear  N
1:    4 12
2:    3 15
3:    5  5
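
As a small extension of the same idea, a character by can name several columns at once, since data.table accepts a character vector of column names in by. This is a sketch assuming the same DT built from mtcars; foo_chr() and plotfoo_chr() are hypothetical names used here so the functions above are not overwritten.

```r
library(data.table)

DT <- data.table(mtcars)

# Hypothetical variant of foo() that takes column names as strings
foo_chr <- function(data, by) {
  data[, .N, by = by]
}

# The wrapper forwards the character vector unchanged, so no
# substitute()/eval() juggling is needed at any level.
plotfoo_chr <- function(data, by) {
  foo_chr(data, by)
}

# Count rows for every gear/cyl combination
res <- plotfoo_chr(DT, c("gear", "cyl"))
print(names(res))
```

Note that with character names the grouping columns keep their own names (gear, cyl) in the result, whereas the by=list(eval(by)) version labels the grouping column by.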
answered Oct 22 '22 11:10 by IRTFM