
R data.table breaks in exported functions

I'm having a problem getting data.table to work in roxygen2 exported functions.

Here's a simple, fake function in a file called foo.R (located in the R directory of my package) which uses data.table:

#' Data.table test function
#' @export
foo <- function() {
  m <- data.table(c1 = c(1,2,3))
  print(is.data.table(m))
  m[,sum(c1)]
}

If I copy and paste this function into R, this function works fine:

> foo <- function() {
+   m <- data.table(c1 = c(1,2,3))
+   print(is.data.table(m))
+   m[,sum(c1)]
+ }
> foo()
[1] TRUE
[1] 6

But if I simply load the exported function, R thinks that the data.table is a data.frame and breaks:

> rm(foo)
> load_all()
Loading test_package
> foo
function() {
  m <- data.table(c1 = c(1,2,3))
  print(is.data.table(m))
  m[,sum(c1)]
}
<environment: namespace:test_package>
> foo()
[1] TRUE
Error in `[.data.frame`(x, i, j) : object 'c1' not found

What's up?

UPDATE

Thanks to @GSee for the help. Looks like this is actually a devtools issue. Check out the interactive command line code below.

After loading the test_package library, foo runs correctly:

> foo
function ()
{
    m <- data.table(c1 = c(1, 2, 3))
    print(is.data.table(m))
    m[, sum(c1)]
}
<environment: namespace:test_package>
> foo()
[1] TRUE
[1] 6

Running load_all() breaks foo:

> load_all()
Loading test_package
> foo()
[1] TRUE
Error in `[.data.frame`(x, i, j) : object 'c1' not found

Somehow source('R/foo.R') revives foo functionality:

> source('R/foo.R')
> foo
function() {
  m <- data.table(c1 = c(1,2,3))
  print(is.data.table(m))
  m[,sum(c1)]
}
> foo()
[1] TRUE
[1] 6

And future calls to load_all() don't break foo again:

> load_all()
Loading test_package
> foo
function() {
  m <- data.table(c1 = c(1,2,3))
  print(is.data.table(m))
  m[,sum(c1)]
}
> foo()
[1] TRUE
[1] 6

Also, I updated to devtools 1.5 and tried adding .datatable.aware=TRUE, but that didn't seem to do anything.

kjmij asked Apr 23 '14 18:04


1 Answer

The problem, as @GSee pointed out in the comments, still seems to be this issue.

In order to find out if a package is data.table aware, data.table calls the function cedta(), which is:

> data.table:::cedta
function (n = 2L) 
{
    te = topenv(parent.frame(n))
    if (!isNamespace(te)) 
        return(TRUE)
    nsname = getNamespaceName(te)
    ans = nsname == "data.table" || "data.table" %chin% names(getNamespaceImports(te)) || 
        "data.table" %chin% tryCatch(get(".Depends", paste("package", 
            nsname, sep = ":"), inherits = FALSE), error = function(e) NULL) || 
        (nsname == "utils" && exists("debugger.look", parent.frame(n + 
            1L))) || nsname %chin% cedta.override || identical(TRUE, 
        tryCatch(get(".datatable.aware", asNamespace(nsname), 
            inherits = FALSE), error = function(e) NULL))
    if (!ans && getOption("datatable.verbose")) 
        cat("cedta decided '", nsname, "' wasn't data.table aware\n", 
            sep = "")
    ans
}
<bytecode: 0x7ff67b9ca190>
<environment: namespace:data.table>
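
To make the logic above easier to follow, here is a minimal base-R sketch of three of the conditions cedta() tests. The helper name is_dt_aware_sketch is hypothetical and not part of data.table, and %in% is used as a stand-in for data.table's %chin% so the snippet runs without data.table installed. It is only an approximation of the real function for experimenting at the console.

```r
# Hypothetical helper, NOT part of data.table: a simplified mirror of three
# of the checks performed by data.table:::cedta() for a given namespace name.
is_dt_aware_sketch <- function(ns_name) {
  ns <- asNamespace(ns_name)

  # 1. Does the namespace import data.table?
  imports_dt <- "data.table" %in% names(getNamespaceImports(ns))

  # 2. Does the attached package environment carry a .Depends entry
  #    that lists data.table? (NULL if the variable is missing.)
  depends <- tryCatch(
    get(".Depends", pos = paste("package", ns_name, sep = ":"),
        inherits = FALSE),
    error = function(e) NULL
  )
  depends_dt <- "data.table" %in% depends

  # 3. Has the package set .datatable.aware = TRUE in its namespace?
  flag_set <- identical(TRUE, tryCatch(
    get(".datatable.aware", envir = ns, inherits = FALSE),
    error = function(e) NULL
  ))

  imports_dt || depends_dt || flag_set
}

# The stats package meets none of the three conditions, so this prints FALSE.
print(is_dt_aware_sketch("stats"))
```

Calling the helper with your own package's name after load_all(), and again after a regular install and library() call, should show which of the conditions differs between the two ways of loading.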

The relevant check here is:

"data.table" %chin% get(".Depends", paste("package", nsname, sep=":"), inherits=FALSE)

When a package depends on data.table, the above check should return TRUE, provided you installed the package via R CMD INSTALL and then loaded it in the usual way. This is because, when the package is attached, R by default also creates a ".Depends" variable in the attached package environment (the "package:<name>" entry on the search path that the check looks in). You can see it with:

ls("package:test", all=TRUE)
# [1] ".Depends" "foo"     

However, when you do devtools:::load_all(), this variable doesn't seem to be set.

# new session + set path to package's dir
devtools:::load_all()
ls("package:test", all=TRUE)
# [1] "foo"

So cedta() never learns that this package depends on data.table. However, when you manually set .datatable.aware=TRUE, the line:

identical(TRUE, get(".datatable.aware", asNamespace(nsname), inherits = FALSE))

is evaluated and returns TRUE, which works around the problem. The underlying cause remains, though: devtools does not place the .Depends variable in the package environment.
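
As a rough sketch of what this looks like inside a package, either of the two options below should satisfy one of the cedta() conditions. The file name R/aaa.R is an assumption chosen for illustration, and the second option assumes the package lists data.table under Imports in DESCRIPTION and that roxygen2 regenerates NAMESPACE. Whether either is enough with a given devtools version is not something this sketch can promise.

```r
# R/aaa.R (hypothetical file name inside the package's R/ directory)

# Option 1: declare the package data.table-aware explicitly.
# cedta() looks this variable up in the package namespace.
.datatable.aware <- TRUE

# Option 2: import data.table into the namespace via roxygen2, so that
# "data.table" shows up in names(getNamespaceImports(<namespace>)).
# Assumes DESCRIPTION also lists data.table under Imports.
#' @import data.table
NULL
```

Either option avoids relying on the .Depends variable, which is the part that load_all() does not appear to set up.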

All in all, this is really not an issue with data.table.

Arun answered Oct 30 '22 15:10