
:= (pass by reference) operator in the data.table package modifies another data table object simultaneously

While testing my code, I found out the following: If I assign a data.table DT1 to DT and change DT afterwards, DT1 changes with it. So DT and DT1 seem to be internally linked. Is this intended behavior? Although I'm not a programming expert, this looks wrong to me, and testing it with simple R variables or a data.frame, I couldn't reproduce the behavior. What's happening here?

    DF <- data.frame(ID=letters[1:5],
                     value=1:5)
    DF1 <- DF
    all.equal(DF1, DF)
    [1] TRUE
    DF[1, "value"] <- DF[1, "value"]*2
    all.equal(DF1, DF)
    [1] "Component 2: Mean relative difference: 1"

    library(data.table)
    data.table 1.7.1  For help type: help("data.table")
    DT <- data.table(ID=letters[1:5],
                     value=1:5)
    DT1 <- DT
    all.equal(DT1, DT)
    [1] TRUE
    DT[, value:=value*2]
         ID value
    [1,]  a     2
    [2,]  b     4
    [3,]  c     6
    [4,]  d     8
    [5,]  e    10
    all.equal(DT1, DT)
    [1] TRUE
Christoph_J asked Nov 06 '11 21:11



1 Answer

This piece of the data.table documentation should help (see ?data.table::copy):

No value is returned. The data.table is modified by reference. If you require a copy, take a copy first (using DT2=copy(DT)). copy() may also sometimes be useful before := is used to subassign to a column by reference.
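
A minimal sketch of what that means in practice, assuming the data.table package is installed. The name DT2 follows the documentation quote above; the `2L` literal is an illustrative choice that keeps the column integer so the comparisons stay exact.

```r
library(data.table)

DT  <- data.table(ID = letters[1:5], value = 1:5)
DT1 <- DT          # plain assignment: DT1 and DT refer to the same object
DT2 <- copy(DT)    # explicit copy: DT2 is independent of DT

# := updates DT in place (by reference), so no new object is created
DT[, value := value * 2L]

same_as_DT <- identical(DT1$value, DT$value)   # DT1 sees the change
unchanged  <- identical(DT2$value, 1:5)        # DT2 keeps the original values
print(c(DT1_follows_DT = same_as_DT, DT2_unchanged = unchanged))
```

So the behavior in the question is intended: `DT1 <- DT` does not duplicate the table, and `:=` never triggers a copy. A base data.frame behaves differently because `DF[1, "value"] <- ...` copies on modification.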

Ramnath answered Sep 30 '22 12:09