I am looking to match multiple string criteria and then subset the matching rows in R, using grepl to find the matches. I found a nice solution in another post (the code is specific to that post, but you get the idea): subset(GEMA_EO5, grepl(paste(l, collapse="|"), GEMA_EO5$RefSeq_ID))
I am wondering whether it is possible to grepl in two columns, instead of just RefSeq_ID as in the example above, either with grepl or via any other method. In other words, I would like to look for the options in l not just in one column, but in two (or however many). Is this possible?
For example, take 3 columns: a, b and c. I would like a criterion such that the rows containing T (rows 3 and 4) are selected, despite the format "T I" in (3,b). It should identify both (4,a) and (3,b), hence the link to the previous question. I want it to look in column a AND column b, not one or the other.
a b c
A A C P L
V V B W E E
W T I P J G
T W P J
Here's some demo data to show how this works:
set.seed(1234)
dat <- data.frame(A = sample(letters[1:3], 10, TRUE),
                  B = sample(letters[1:3], 10, TRUE))
Using [ to subset makes this a lot more clear in my opinion - we can use grepl to give a logical vector based on a match, and use | to combine two tests (on multiple columns). If you wanted a subset of all the rows that contained an 'a' in either column:
dat.a <- dat[with(dat, grepl("a", A)|grepl("a", B)),]
A B
1 b a
2 b a
3 a c
5 a a
9 a a
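The same idea can be extended to several search terms and any number of columns. The following is only a sketch under a few assumptions: the data frame `df`, the vector `cols` and the values in it are hypothetical (loosely modelled on the a/b/c example in the question, whose exact cell contents are not fully recoverable), and `l` plays the role of the vector of search terms from the question. It builds one pattern with paste(l, collapse = "|"), runs grepl on each chosen column, and combines the resulting logical vectors with | via Reduce:

```r
# Hypothetical data, loosely modelled on the question's a/b/c example.
df <- data.frame(a = c("A", "V", "W", "T"),
                 b = c("C", "B", "T I", "W"),
                 c = c("P L", "E E", "P J G", "P J"),
                 stringsAsFactors = FALSE)

l    <- c("T", "Z")          # search terms (assumed values)
cols <- c("a", "b")          # columns to search; column c is ignored

# One regular expression that matches any of the terms: "T|Z"
pattern <- paste(l, collapse = "|")

# One logical vector per column, then an element-wise OR across columns
matches_per_col <- lapply(df[cols], function(x) grepl(pattern, x))
keep <- Reduce(`|`, matches_per_col)

hits <- df[keep, ]
hits                         # rows 3 and 4 of df
```

Because the pattern is a regular expression, "T" matches anywhere in the cell, which is why "T I" in column b is picked up. If the terms should be treated as literal strings, or matched only as whole words, the pattern would need to be adjusted accordingly.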