
Cannot use dput for data.table in R

Tags: r, data.table

I have the following data.table, and I cannot use the output of dput to recreate it:

> ddt
   Unit Anything index new
1:    A      3.4     1   1
2:    A      6.9     2   1
3:   A1      1.1     1   2
4:   A1      2.2     2   2
5:    B      2.0     1   3
6:    B      3.0     2   3
> 
> 
> str(ddt)
Classes ‘data.table’ and 'data.frame':  6 obs. of  4 variables:
 $ Unit    : Factor w/ 3 levels "A","A1","B": 1 1 2 2 3 3
 $ Anything: num  3.4 6.9 1.1 2.2 2 3
 $ index   : num  1 2 1 2 1 2
 $ new     : int  1 1 2 2 3 3
 - attr(*, ".internal.selfref")=<externalptr> 
 - attr(*, "sorted")= chr  "Unit" "Anything"
> 
> 
> dput(ddt)
structure(list(Unit = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("A", 
"A1", "B"), class = "factor"), Anything = c(3.4, 6.9, 1.1, 2.2, 
2, 3), index = c(1, 2, 1, 2, 1, 2), new = c(1L, 1L, 2L, 2L, 3L, 
3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x8948f68>, sorted = c("Unit", 
"Anything"))
> 

When I paste the dput output back into the console, I get the following error:

> dt = structure(list(Unit = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("A", 
+ "A1", "B"), class = "factor"), Anything = c(3.4, 6.9, 1.1, 2.2, 
+ 2, 3), index = c(1, 2, 1, 2, 1, 2), new = c(1L, 1L, 2L, 2L, 3L, 
+ 3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA, 
+ -6L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x8948f68>, sorted = c("Unit", 
Error: unexpected '<' in:
"3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"), .internal.selfref = <"
> "Anything"))
Error: unexpected ')' in ""Anything")"

Where is the problem and how can it be corrected? Thanks for your help.

rnso asked Aug 27 '14 17:08


2 Answers

The problem is that dput prints the address of an external pointer (the .internal.selfref attribute, which data.table uses internally and reconstructs when required). That printed address is not valid R syntax, so the output cannot be parsed back.
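To see this in isolation, here is a minimal sketch on a small two-row table (the table and the variable names are made up for illustration). It checks that the attribute is an external pointer, that the deparsed text contains the `<pointer: ...>` token, and that parsing that text fails.

```r
library(data.table)

# Small illustrative table (not the one from the question)
ddt <- data.table(Unit = factor(c("A", "A")), Anything = c(3.4, 6.9))

# The attribute that causes the trouble is an external pointer
ptr_class <- class(attr(ddt, ".internal.selfref"))

# Capture what dput() prints and look for the pointer token
out <- paste(capture.output(dput(ddt)), collapse = "\n")
has_pointer <- grepl("<pointer", out, fixed = TRUE)

# Feeding that text back to the parser fails, as in the question
res <- try(parse(text = out), silent = TRUE)
parse_failed <- inherits(res, "try-error")

print(ptr_class)
print(has_pointer)
print(parse_failed)
```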

If you manually cut out the .internal.selfref part, it will work just fine, apart from a one-time warning from data.table on some operations.
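As a sketch of that manual fix, the block below is the dput output from the question with only the `.internal.selfref = <pointer: ...>` argument deleted. The final `copy()` call is my own addition, not part of the answer: as far as I know, copying returns a data.table with a fresh internal self-reference, which should avoid the warning on later `:=` calls.

```r
library(data.table)

# dput() output from the question, with the .internal.selfref argument removed by hand
dt <- structure(list(Unit = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("A",
"A1", "B"), class = "factor"), Anything = c(3.4, 6.9, 1.1, 2.2,
2, 3), index = c(1, 2, 1, 2, 1, 2), new = c(1L, 1L, 2L, 2L, 3L,
3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA,
-6L), class = c("data.table", "data.frame"), sorted = c("Unit",
"Anything"))

# Assumption: copy() rebuilds the self-reference that structure() cannot supply
dt <- copy(dt)

# The table is usable as a data.table again, including := by reference
dt[, flag := TRUE]
print(dt)
```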

You could file a feature request (FR) with data.table about this, but fixing it would require data.table to modify the base function, similar to how rbind is currently handled.

eddi answered Oct 27 '22 01:10


I have also found this behavior rather annoying, so I wrote my own dput function that drops the .internal.selfref attribute before deparsing.

library(data.table)  # provides is.data.table(), copy() and setattr()

# Masks base::dput. The body follows base R's dput, plus the data.table block.
dput <- function (x, file = "", control = c("keepNA", "keepInteger", 
                                    "showAttributes")) 
{
  if (is.character(file)) 
    if (nzchar(file)) {
      file <- file(file, "wt")
      on.exit(close(file))
    }
  else file <- stdout()
  opts <- .deparseOpts(control)
  # added for data.tables: drop the external pointer, which cannot be parsed back
  if (is.data.table(x)) {
    x <- copy(x)  # setattr() modifies by reference, so work on a copy
    setattr(x, '.internal.selfref', NULL)
  }
  if (isS4(x)) {
    clx <- class(x)
    cat("new(\"", clx, "\"\n", file = file, sep = "")
    for (n in .slotNames(clx)) {
      cat("    ,", n, "= ", file = file)
      dput(slot(x, n), file = file, control = control)
    }
    cat(")\n", file = file)
    invisible()
  }
  else .Internal(dput(x, file, opts))
}
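
If you would rather not mask base dput or call .Internal, the same idea can be sketched as a small wrapper. The name dput_dt is hypothetical (it is not part of data.table or base R), and the table below is rebuilt from the question's data for illustration. The round trip at the end checks that the printed text can be parsed and evaluated again.

```r
library(data.table)

# Hypothetical helper: deparse a data.table without the external pointer attribute
dput_dt <- function(x, ...) {
  if (is.data.table(x)) {
    x <- copy(x)                           # leave the caller's object untouched
    setattr(x, ".internal.selfref", NULL)  # drop the pointer that cannot be parsed
  }
  base::dput(x, ...)
}

# Table rebuilt from the question's data, keyed the same way
ddt <- data.table(Unit = factor(c("A", "A", "A1", "A1", "B", "B")),
                  Anything = c(3.4, 6.9, 1.1, 2.2, 2, 3),
                  index = c(1, 2, 1, 2, 1, 2),
                  new = c(1L, 1L, 2L, 2L, 3L, 3L),
                  key = c("Unit", "Anything"))

# Round trip: capture the printed text, then parse and evaluate it
txt <- capture.output(dput_dt(ddt))
rebuilt <- eval(parse(text = paste(txt, collapse = "\n")))

print(identical(rebuilt$Unit, ddt$Unit))
print(identical(rebuilt$Anything, ddt$Anything))
```

As with the manual fix, the rebuilt table has no internal self-reference, so data.table may warn once on the first := unless you copy() it first.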
shadow answered Oct 27 '22 01:10