Is it possible to combine chaining and assignment by reference in a data.table?
For example, I would like to do this:
DT[a == 1][b == 0, c := 2]
However, this leaves the original table unchanged: DT[a == 1] seems to create a temporary table, which is then modified and returned.
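The behaviour described above can be reproduced with a small sketch. The mock data and the name `res` are assumptions made for illustration only; they are not part of the original question.

```r
library(data.table)

# Mock data (assumed for illustration)
DT <- data.table(a = c(1, 1, 1, 2, 2), b = c(0, 2, 0, 1, 1))

# DT[a == 1] returns a new data.table, so := modifies that copy, not DT
res <- DT[a == 1][b == 0, c := 2]

"c" %in% names(DT)    # the original table has no column `c`
"c" %in% names(res)   # the returned subset does
```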
I would rather not do
DT[a == 1 & b == 0, c := 2]
as this is very slow and I would also rather avoid
DT <- DT[a == 1][b == 0, c := 2]
as I would prefer to do the assignment by reference. This question is part of the question [1], where it is left unanswered.
[1] Conditional binary join and update by reference using the data.table package
I'm not sure why you think that, even if DT[a == 1][b == 0, c := 2] worked in theory, it would be more efficient than DT[a == 1 & b == 0, c := 2].
Either way, the most efficient solution in your case would be to key by both a and b and conduct the assignment by reference while performing a binary join on both:
library(data.table)
DT <- data.table(a = c(1, 1, 1, 2, 2), b = c(0, 2, 0, 1, 1)) ## mock data
setkey(DT, a, b) ## keying by both `a` and `b` (this also sorts DT by reference)
DT[J(1, 0), c := 2] ## update `c` by reference on rows where a == 1 and b == 0
DT
#    a b  c
# 1: 1 0  2
# 2: 1 0  2
# 3: 1 2 NA
# 4: 2 1 NA
# 5: 2 1 NA
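If reordering the table with setkey() is not wanted, two variations might be worth trying. This is only a sketch under assumptions: the first uses the `on=` argument for an ad hoc join, the second narrows down row numbers in two steps (close in spirit to the chained filter) and then assigns by reference. The names `idx` and `d` are hypothetical and were not in the original answer, and no claim is made here about which variant is faster.

```r
library(data.table)

# Same mock data as above, left unkeyed
DT <- data.table(a = c(1, 1, 1, 2, 2), b = c(0, 2, 0, 1, 1))

# Variant 1: ad hoc join on `a` and `b` without calling setkey()
DT[.(1, 0), on = c("a", "b"), c := 2]

# Variant 2: filter in two steps on row numbers, then update by reference
idx <- DT[a == 1, which = TRUE]   # row numbers where a == 1
idx <- idx[DT$b[idx] == 0]        # keep only those where b == 0 as well
DT[idx, d := 2]                   # `d` is a hypothetical second column

DT
```

Both variants modify DT in place and keep the original row order, since neither sets a key.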