Seems like an easy one, but... well...
Given a named vector of regular expressions and a data table as follows:
library(data.table)
regexes <- c(a="^A$")
dt <- fread("
a,A,1
a,B,1
b,A,1
")
The input data table is
dt
# V1 V2 V3
# 1: a A 1
# 2: a B 1
# 3: b A 1
My goal for the 1st element in regexes would be:
If V1=="a", set V3:=2, except when V2 matches the corresponding regular expression ^A$; in that case set V3:=3.
(a is names(regexes)[1], ^A$ is regexes[1], and 2 and 3 are just for demo purposes. I also have more names and regular expressions to loop over, and the data set has about 300,000 rows.)
So the expected output is
# V1 V2 V3
# 1: a A 3 (*)
# 2: a B 2 (**)
# 3: b A 1
(*) 3 because V1 is a and V2 (A) matches the regex,
(**) 2 because V1 is a and V2 (B) does not match ^A$.
I tried to loop through the regexes and pipe the subsetting through like this:
for (x in seq(regexes))
  dt[V1==names(regexes)[x], V3:=2][grepl(regexes[x], V2), V3:=3]
However...
dt
# V1 V2 V3
# 1: a A 3
# 2: a B 2
# 3: b A 3 <- wrong, should remain 1
... it does not work as expected: grepl uses the complete V2 column, not just the V1=="a" subset. I also tried some other things, which worked but took too long (i.e. not the way to use data.table).
Question: What would be the best data.table way to go here? I'm using packageVersion("data.table") ‘1.9.7’.
Note that I could go the data frame route e.g. like this
df <- as.data.frame(dt)
for (x in seq(regexes)) {
  idx <- df$V1==names(regexes)[x]
  df$V3[idx] <- 2
  df$V3[idx][grepl(regexes[x], df$V2[idx])] <- 3 # or ifelse()
}
But - of course - I would not want to convert the data.table to a data.frame and then back to a data.table if possible.
Thanks in advance!
... it does not work as expected, grepl uses the complete V2 column, not just the V1=="a" subset.
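The reason is that the second [ in the chain is applied to the whole table returned by the first one, so its grepl() no longer knows about the V1 filter. If you want to keep the loop and stay with base grepl, a minimal sketch (assuming the demo data from the question, rebuilt here with data.table() instead of fread) is to subset on V1 once in i and decide between 2 and 3 inside j, where V2 is already restricted to the matching rows:

```r
library(data.table)

regexes <- c(a = "^A$")
# Same demo data as in the question, built directly for self-containment.
dt <- data.table(V1 = c("a", "a", "b"),
                 V2 = c("A", "B", "A"),
                 V3 = 1L)

for (x in seq_along(regexes)) {
  # i selects the rows for this name; inside j, V2 only holds those rows,
  # so grepl() is evaluated on the subset rather than on the full column.
  dt[V1 == names(regexes)[x],
     V3 := ifelse(grepl(regexes[x], V2), 3L, 2L)]
}

dt$V3
# 3 2 1
```

This still runs one pass per regex, so it may not be the fastest option with many names.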
I would use stringi, which allows for easy vectorization of regex tests:
library(stringi)
dt[V1 %in% names(regexes),
   V3 := V3 + 1L + stri_detect(V2, regex = regexes[V1])
]
V1 V2 V3
1: a A 3
2: a B 2
3: b A 1
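The stri_detect family of functions is like grepl from base, except that it is vectorized over the pattern as well as the string, so each row can be tested against its own regex (here regexes[V1]).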
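To see what happens inside j, here is a small standalone sketch. The vectors v1 and v2 are stand-ins I made up for the V1 and V2 values of the rows selected by V1 %in% names(regexes); indexing the named vector regexes by v1 yields one pattern per row, and stri_detect pairs them up element by element:

```r
library(stringi)

regexes <- c(a = "^A$")

# Hypothetical stand-ins for the subset of rows where V1 is "a".
v1 <- c("a", "a")
v2 <- c("A", "B")

# Named-vector lookup: one regex per row, in row order.
patterns <- regexes[v1]

# Element-wise test: v2[i] is matched against patterns[i].
hits <- stri_detect(v2, regex = patterns)
hits
# TRUE FALSE
```

Base grepl would only use the first element of a longer pattern vector (with a warning), which is why the loop was needed there. Also note that V3 + 1L + TRUE gives 3 and V3 + 1L + FALSE gives 2 only because V3 starts at 1 in the demo data.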