
Converting all data.frames in environment to data.tables

Tags: r, data.table

I get a warning when I use := right after converting all data.frames to data.tables:

library(data.table) #Win R-3.5.1 x64 data.table_1.12.2
df1 <- data.frame(A=1, B=2)
df2 <- data.frame(D=3)
lapply(mget(ls()), function(x) {
    if (is.data.frame(x)) {
        setDT(x)
    }
})
df1[, rn:=.I]

Warning message: In `[.data.table`(df1, , `:=`(rn, .I)) : Invalid .internal.selfref detected and fixed by taking a (shallow) copy of the data.table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or was created manually using structure() or similar). Avoid names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. If this message doesn't help, please report your use case to the data.table issue tracker so the root cause can be fixed or this message improved.

The following also generates the same warning:

df3 <- data.frame(E=3)
df4 <- data.frame(FF=4)
for (l in list(df3, df4)) setDT(l)
df3[, rn:=.I]

Converting them one by one works, but is tedious:

df5 <- data.frame(G=5)
setDT(df5)
df5[, rn := .I]    #no warning

What is the idiomatic way to convert all data.frames to data.tables?

Related:

  1. Using setDT inside a function
  2. Invalid .internal.selfref in data.table
chinsoon12 asked Jul 30 '19 09:07





1 Answer

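setDT operates on the name/symbol it is given and rebinds that name to the converted table, while get (and mget) return only the value of the object. In the question's code, setDT therefore acts on the local variables x and l rather than on df1 or df3. You can construct the setDT expression for each name and evaluate it: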

library(data.table) 
df1 <- data.frame(A=1, B=2)
df2 <- data.frame(D=3)
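for (x in ls()) {
  # get(x) returns the value; it is used here only to test the type
  if (is.data.frame(get(x))) {
    # build the call setDT(<name>) so that setDT sees the original symbol
    eval(substitute(setDT(x), list(x = as.name(x))))
  }
}
rm(x)  # remove the loop variable left behind in the workspace
df1[, rn := .I]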

I would use a loop rather than lapply to avoid complications (e.g., with the evaluating environment).
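
If the conversion is needed in more than one place, the same loop can be wrapped in a small function that takes the target environment explicitly. This is only a sketch: set_all_dt and its env argument are hypothetical names, not part of data.table, and it assumes setDT assigns the converted table back into the environment in which the call is evaluated.

```r
library(data.table)

# Hypothetical helper (not part of data.table): convert every data.frame
# bound in `env` to a data.table by evaluating setDT(<name>) in `env`.
set_all_dt <- function(env = parent.frame()) {
  force(env)  # resolve the default (the caller's environment) up front
  for (nm in ls(envir = env)) {
    if (is.data.frame(get(nm, envir = env))) {
      # Build the call data.table::setDT(<name>) and evaluate it in `env`,
      # so that setDT sees the original symbol rather than a copied value.
      eval(bquote(data.table::setDT(.(as.name(nm)))), envir = env)
    }
  }
  invisible(NULL)
}

df1 <- data.frame(A = 1, B = 2)
df2 <- data.frame(D = 3)
set_all_dt()        # converts df1 and df2 where they are defined
df1[, rn := .I]     # := now works on the original name
```

Because the loop variable lives inside the function, there is nothing to rm() afterwards. Objects that are already data.tables also pass is.data.frame, so they are handed to setDT again, which should be harmless.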

Frank answered Sep 28 '22 10:09
