Please have a look at the following sample code.
DT <- data.table(1:15, 0, rbinom(15, 2, 0.5))
I can filter by condition with DT[V3 == 1,] or select rows by index with DT[1:5,].
How can I do both? In the following code, the sequence of indexed rows seems to be ignored:
DT[V3 == 1 & 1:5]
I could do DT[1:5,][V3 == 1], but then, for example, I wouldn't be able to modify the filtered rows:
DT[1:5,][V3 == 1, V2 := 1]
This only works with the following workaround:
DT[V3 == 1 & DT[,.I <= 5], V2 := 1]
However, this looks too data.frame-ish to me. Is there a more elegant way, and why does DT[V3 == 1 & 1:5] not work?
Here's a faster way for @akrun's example:
library(data.table)
library(microbenchmark)
set.seed(24)
DT <- data.table(1:1e6, 0, rbinom(1e6, 2, 0.5))
DT1 <- copy(DT)
DT2 <- copy(DT)
microbenchmark(
sequential = DT1[which(V3[1:5] == 1L), V2 := 1],
set_ops = DT2[intersect(which(V3 == 1), 1:5), V2 := 1],
times = 1, unit = "relative")
# Unit: relative
# expr min lq mean median uq max neval
# sequential 1.00000 1.00000 1.00000 1.00000 1.00000 1.00000 1
# set_ops 55.43582 55.43582 55.43582 55.43582 55.43582 55.43582 1
It's "sequential" in the sense that we subset by index before evaluating the condition.
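To make the difference concrete, here is a small sketch on the 15-row table from the question. The names rows_set_ops and rows_sequential are only for illustration. Note that the two row sets coincide here only because the index starts at row 1; for an arbitrary index the positions found inside the subset have to be mapped back to the original rows.

```r
library(data.table)

set.seed(24)
DT <- data.table(1:15, 0, rbinom(15, 2, 0.5))

# set_ops: evaluate the condition on all 15 rows, then intersect with the index
rows_set_ops <- intersect(which(DT$V3 == 1), 1:5)

# sequential: take the first 5 values of V3, then evaluate the condition on those only
rows_sequential <- which(DT$V3[1:5] == 1)

# both approaches select the same rows when the index is 1:5
identical(rows_set_ops, rows_sequential)
```

The sequential version only ever touches 5 values of V3, while the set_ops version scans the whole column first, which is presumably where the timing gap in the benchmark comes from.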
The generalization is
cond = quote(V3 == 1)
indx = 1:5
DT[ DT[indx, indx[eval(cond)]], V2 := 1]
# or
set(DT, i = DT[indx, indx[eval(cond)]], j = "V2", v = 1)
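As for why DT[V3 == 1 & 1:5] does not work: as far as I can tell, this is plain R behaviour rather than anything specific to data.table. The & operator coerces 1:5 to logical, every nonzero integer becomes TRUE, and the vector is recycled to the number of rows, so the index contributes nothing to the filter. A minimal sketch using the column directly (cond_only and cond_and_index are illustrative names):

```r
library(data.table)

set.seed(24)
DT <- data.table(1:15, 0, rbinom(15, 2, 0.5))

# `&` treats 1:5 as logical: every nonzero integer is TRUE
index_as_logical <- as.logical(1:5)

# the condition alone, and the condition combined with 1:5
cond_only      <- DT$V3 == 1
cond_and_index <- DT$V3 == 1 & 1:5   # 1:5 is recycled to length 15

# the two are identical, so the 1:5 part is effectively ignored
identical(cond_only, cond_and_index)
```

So the expression is interpreted as "V3 == 1 and TRUE" for every row, not as "V3 == 1 within rows 1 to 5".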