In a dataframe like this:
df <- data.frame(id = c(1,2,3), text = c("hi my name is E","hi what's your name","name here"))
I would like to keep only the rows whose text contains both of the words hi and name. Example of expected output:
df <- data.frame(id = c(1,2), text = c("hi my name is E","hi what's your name"))
I tried this, but it doesn't work properly:
library(tidyverse)
df %>%
  filter(str_detect(text, 'name&hi'))
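In a regex, & is a literal character, so 'name&hi' only matches the literal text name&hi. Here is one simple answer, followed by two more complex ones that you should only need if you have more than two words to check: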
library(tidyverse)
df %>%
  filter(str_detect(text, 'hi') & str_detect(text, 'name'))

df %>%
  filter(rowSums(outer(text, c('hi', 'name'), str_detect)) == 2)

df %>%
  filter(reduce(c('hi', 'name'), ~ .x & str_detect(text, .y), .init = TRUE))
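Note that the snippets above match 'hi' as a substring, so a row such as "this name" would also pass. If whole-word matching is what you want, here is a minimal sketch. It assumes a hypothetical helper, has_all_words() (my own name, not part of stringr), which wraps each word in \\b word boundaries, and it assumes the words contain no regex metacharacters. The extra fourth row is made up to show the difference.

```r
library(dplyr)
library(stringr)
library(purrr)

# Hypothetical helper (not part of stringr): TRUE where `text` contains
# every element of `words` as a whole word. Assumes the words contain no
# regex metacharacters.
has_all_words <- function(text, words) {
  patterns <- str_c("\\b", words, "\\b")
  # One logical vector per word, then AND them together element-wise
  hits <- map(patterns, function(p) str_detect(text, p))
  reduce(hits, `&`, .init = rep(TRUE, length(text)))
}

# Example data with an extra, made-up row where "hi" only occurs inside "this"
df <- data.frame(
  id = c(1, 2, 3, 4),
  text = c("hi my name is E", "hi what's your name", "name here", "this name")
)

result <- df %>%
  filter(has_all_words(text, c("hi", "name")))
```

With this version the word list can grow without changing the filter call.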
We can also use a single regex that matches either 'hi' followed by 'name' or (|) 'name' followed by 'hi':
library(dplyr)
library(stringr)
df %>%
  filter(str_detect(text, 'hi\\b.*\\bname|name\\b.*\\bhi'))
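The alternation has to list every ordering of the words, which gets long quickly with more than two of them. As a possible alternative (not from the answer above), one lookahead per word makes the order irrelevant; stringr's ICU regex engine supports lookaheads. The pattern name both_words is only for illustration.

```r
library(dplyr)
library(stringr)

df <- data.frame(
  id = c(1, 2, 3),
  text = c("hi my name is E", "hi what's your name", "name here")
)

# Each (?=...) lookahead asserts that the whole word occurs somewhere in the
# string without consuming any characters, so word order does not matter.
both_words <- "^(?=.*\\bhi\\b)(?=.*\\bname\\b)"

result <- df %>%
  filter(str_detect(text, both_words))
```

Adding a third word then only means adding one more (?=.*\\bword\\b) group.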