
Variable containing data.table names changed in place? [duplicate]

Tags: r, data.table

Maybe someone can tell me why the names I assigned to "idVars" change after I add a column to my data.table (without reassigning them)? How can I make the assignment persist so that it stores only the first two column names?

Thanks!

library(data.table)

DT <- data.table(a=1:10, b=1:10)
idVars <- names(DT)
print(idVars)
# [1] "a" "b"

DT[, "c" := 1:10]
print(idVars)
# [1] "a" "b" "c"


# devtools::session_info()                
# data.table * 1.11.6  2018-09-19 CRAN (R 3.5.1)
ismirsehregal asked Apr 08 '26 20:04


1 Answer

The assignment idVars <- names(DT) does not copy anything: names(DT) and 'idVars' share the same memory location, so adding a column to DT by reference also changes 'idVars'.

tracemem(names(DT))
#[1] "<0x7f9d74c99600>"
tracemem(idVars)
#[1] "<0x7f9d74c99600>"
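
As a hedged alternative to tracemem() (which only works when R was built with memory profiling enabled), the same check can be sketched with data.table's address() helper. The variable names same_before, snapshot and differs are made up for this illustration; the first comparison is expected to be TRUE according to the behaviour described here, but that may depend on the data.table version, so no output is shown for it.

```r
library(data.table)

DT <- data.table(a = 1:10, b = 1:10)
idVars <- names(DT)            # plain assignment, no copy is made

# address() returns the memory address of an object as a string
same_before <- address(idVars) == address(names(DT))
print(same_before)

# copy() duplicates the character vector, so the address differs
snapshot <- copy(names(DT))
differs <- address(snapshot) != address(names(DT))
print(differs)
```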

So create an explicit copy of the names instead:

idVars <- copy(names(DT))
tracemem(idVars)
#[1] "<0x7f9d7d2b97c8>"

and it won't change after the assignment by reference:

DT[, "c" := 1:10]
idVars
#[1] "a" "b"

According to ?copy:

A copy() may be required when doing dt_names = names(DT). Due to R's copy-on-modify, dt_names still points to the same location in memory as names(DT). Therefore modifying DT by reference now, say by adding a new column, dt_names will also get updated. To avoid this, one has to explicitly copy: dt_names <- copy(names(DT)).
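
Putting this together, here is a minimal sketch of how the copied names could then be used, assuming the goal is to separate the original id columns from columns added later. The name measureVars is hypothetical and not part of the question.

```r
library(data.table)

DT <- data.table(a = 1:10, b = 1:10)

# Snapshot the column names before modifying DT by reference
idVars <- copy(names(DT))

# Add a column by reference; idVars is a separate object and is not touched
DT[, c := 1:10]

# Columns added after the snapshot (setdiff() returns a new vector)
measureVars <- setdiff(names(DT), idVars)

print(idVars)
print(measureVars)
```

Because idVars was copied before the := call, it keeps only the first two column names, and setdiff() can be used to find whatever was added afterwards.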

akrun answered Apr 12 '26 10:04