
Test two columns of strings for match row-wise in R

Let's say I have two columns of strings:

library(data.table)
DT <- data.table(x = c("a","aa","bb"), y = c("b","a","bbb"))

For each row, I want to know whether the string in x occurs within the string in y on the same row. A looping approach would be:

for (i in seq_len(nrow(DT))) {
  # one pattern (x[i]) against one string (y[i]); + 0 turns TRUE/FALSE into 1/0
  DT$test[i] <- DT[i, grepl(x, y) + 0]
}

DT
    x   y test
1:  a   b    0
2: aa   a    0
3: bb bbb    1

Is there a vectorized implementation of this? Calling grep(DT$x, DT$y) uses only the first element of x as the pattern.

Chris asked Dec 09 '22 03:12


2 Answers

You can simply do

DT[, test := grepl(x, y), by = x]
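
As a rough sketch of why this works: grouping by x means that inside each group x is a single value, so grepl() receives exactly one pattern and checks it against that group's y values. The example repeats the question's data. The fixed = TRUE argument is an addition that is not in the answer above; it assumes the strings should be matched literally rather than as regular expressions.

```r
library(data.table)

DT <- data.table(x = c("a", "aa", "bb"), y = c("b", "a", "bbb"))

# by = x makes x a length-one value inside each group, so grepl() gets a
# single pattern. fixed = TRUE (an assumption about the intent) treats that
# pattern as a literal string instead of a regular expression.
DT[, test := grepl(x, y, fixed = TRUE), by = x]

print(DT)
```

Rows that share the same x fall into one group and are handled by a single grepl() call, so this should stay reasonably fast when x has many repeated values. The result is a logical column; add + 0 inside j if 0/1 is wanted as in the question.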
David Arenburg answered Dec 10 '22 17:12


Or use mapply (Vectorize is really just a wrapper around mapply):

DT$test <- mapply(grepl, pattern=DT$x, x=DT$y)
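
To illustrate the remark about Vectorize, here is a minimal base R sketch that runs both forms on the question's data and compares them. The names test_mapply, test_vectorize and vgrepl are made up for this example, and fixed = TRUE is an added assumption that the patterns are literal strings rather than regular expressions.

```r
x <- c("a", "aa", "bb")
y <- c("b", "a", "bbb")

# mapply() pairs x[i] with y[i]. MoreArgs passes fixed = TRUE to every call,
# and USE.NAMES = FALSE stops the result from being named after the patterns.
test_mapply <- mapply(grepl, pattern = x, x = y,
                      MoreArgs = list(fixed = TRUE), USE.NAMES = FALSE)

# Vectorize() returns a function that builds the same mapply() call internally.
vgrepl <- Vectorize(grepl, vectorize.args = c("pattern", "x"),
                    USE.NAMES = FALSE)
test_vectorize <- vgrepl(pattern = x, x = y, fixed = TRUE)

print(test_mapply)
print(identical(test_mapply, test_vectorize))
```

Both forms still call grepl() once per row under the hood, so they are convenient rather than truly vectorized.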
Rorschach answered Dec 10 '22 16:12