I get a warning when I use := right after converting all data.frames to data.tables:
library(data.table) #Win R-3.5.1 x64 data.table_1.12.2
df1 <- data.frame(A=1, B=2)
df2 <- data.frame(D=3)
lapply(mget(ls()), function(x) {
  if (is.data.frame(x)) {
    setDT(x)
  }
})
df1[, rn:=.I]
Warning message:
In `[.data.table`(df1, , `:=`(rn, .I)) : Invalid .internal.selfref detected and fixed by taking a (shallow) copy of the data.table so that := can add this new column by reference. At an earlier point, this data.table has been copied by R (or was created manually using structure() or similar). Avoid names<- and attr<- which in R currently (and oddly) may copy the whole data.table. Use set* syntax instead to avoid copying: ?set, ?setnames and ?setattr. If this message doesn't help, please report your use case to the data.table issue tracker so the root cause can be fixed or this message improved.
The following also generates the same warning:
df3 <- data.frame(E=3)
df4 <- data.frame(FF=4)
for (l in list(df3, df4)) setDT(l)
df3[, rn:=.I]
Converting them one at a time works, but it is tedious:
df5 <- data.frame(G=5)
setDT(df5)
df5[, rn := .I] #no warning
What is the idiomatic way to convert all data.frames to data.tables?
Related:
Method 1: Using setDT() from the data.table package, which needs to be installed. setDT() can coerce data.frames or lists into data.tables. The conversion is applied to the original object, which is modified by reference.
A data.frame in R is similar to a data.table: both hold tabular data, but data.table provides more features, so it is often preferred over data.frame.
setDT converts lists (both named and unnamed) and data.frames to data.tables by reference. This feature was requested on Stack Overflow.
They are similar. Data frames are lists of vectors of equal length, while data.table inherits from data.frame. Therefore data.tables are data.frames, but data.frames are not necessarily data.tables.
setDT operates on the name/symbol, while get returns the value of the object. You can construct the setDT expression and evaluate it:
library(data.table)
df1 <- data.frame(A=1, B=2)
df2 <- data.frame(D=3)
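for (x in ls()) {
  if (is.data.frame(get(x))) {
    # build the call setDT(<name>) from the string in x, then evaluate it
    eval(substitute(setDT(x), list(x = as.name(x))))
  }
}
rm(x) # remove the loop variable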
df1[, rn:=.I]
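To see what the substitute() call in the loop builds, here is a small sketch that only constructs the expression without evaluating it. The variable names nm and expr are illustrative and not part of the original answer.

```r
# Build the unevaluated call setDT(df1) from the string "df1".
nm <- "df1"
expr <- substitute(setDT(x), list(x = as.name(nm)))

is.call(expr)   # the result is a call object, not a value
print(expr)     # shows the call with the symbol df1 in place of x
```

Because the call refers to the symbol df1 rather than to a value returned by get(), evaluating it lets setDT rebind the original variable.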
I would use a loop rather than lapply to avoid complications (e.g., with the evaluating environment).
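If converting by reference is not essential, one possible alternative is to rebind each name to a copy made with as.data.table() using assign(). This is only a sketch, not the approach from the answer above: it assumes the data.frames live in the global environment and that the cost of copying them is acceptable. The names env, nms and nm are illustrative.

```r
library(data.table)

df1 <- data.frame(A = 1, B = 2)
df2 <- data.frame(D = 3)

# Names of all data.frames in the global environment
# (assumption: that is where the tables to convert live).
env <- globalenv()
nms <- Filter(function(nm) is.data.frame(get(nm, envir = env)), ls(env))

# Rebind each name to a data.table copy. Unlike setDT(), as.data.table()
# copies the data, so this trades memory for simplicity.
for (nm in nms) {
  assign(nm, as.data.table(get(nm, envir = env)), envir = env)
}

df1[, rn := .I]  # adds the row-number column by reference
```

For large tables the copy may be expensive, in which case the setDT-based loop remains the better fit.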