I have a dataframe that is in this format:
A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy", "We tried the tea but didn't enjoy it at all", "Family is very important as it gives you values")
df <- data.frame(A, B, stringsAsFactors = FALSE)
My intention is to get the result back as:
ID  A                         B
1   John Smith                is a very highly smart guy
2   Red Shirt                 We tried the tea but didn't enjoy it at all
3   Family values are better  is very important as it gives you
I have tried:
test<-df %>% filter(sapply(1:nrow(.), function(i) grepl(A[i], B[i])))
But it doesn't give me what I want.
Any suggestions/help?
One solution is to use mapply along with strsplit.
The trick is to split df$A into separate words, collapse those words separated by |, and then use the result as the pattern in gsub to replace each match with "".
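# Split each entry of A into its words
lst <- strsplit(df$A, split = " ")
# Build an alternation pattern per row (e.g. "John|Smith") and strip matches from B
df$B <- mapply(function(x, y) gsub(paste0(x, collapse = "|"), "", df$B[y]),
               lst, seq_along(lst))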
df
# A B
# 1 John Smith is a very highly smart guy
# 2 Red Shirt We tried the tea but didn't enjoy it at all
# 3 Family values are better is very important as it gives you
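Note that a plain alternation such as "Family|values|are|better" also matches inside longer words (for example "are" inside "care") and leaves stray spaces where words were removed. A minimal sketch of a stricter variant follows, assuming the words in A contain no regex metacharacters; remove_words and the B_clean column are hypothetical names introduced here for illustration, not part of the answer above.

```r
# Rebuild the example data so the sketch is self-contained
A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy",
       "We tried the tea but didn't enjoy it at all",
       "Family is very important as it gives you values")
df <- data.frame(A, B, stringsAsFactors = FALSE)

# remove_words() is a hypothetical helper, not from any package
remove_words <- function(a, b) {
  words <- strsplit(a, " ", fixed = TRUE)[[1]]
  # \\b anchors each alternative at word boundaries, so "are" will not hit "care"
  pattern <- paste0("\\b(", paste(words, collapse = "|"), ")\\b")
  cleaned <- gsub(pattern, "", b, perl = TRUE)
  # Collapse runs of spaces left behind, then trim both ends
  trimws(gsub(" +", " ", cleaned))
}

# unname() drops the names that mapply() derives from df$A
df$B_clean <- unname(mapply(remove_words, df$A, df$B))
df$B_clean
# [1] "is a very highly smart guy"
# [2] "We tried the tea but didn't enjoy it at all"
# [3] "is very important as it gives you"
```

The result is written to a separate column so the original B stays available for comparison.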
Another option is:
df$B <- mapply(function(x,y)gsub(x,"",y) ,gsub(" ", "|",df$A),df$B)
Data:
A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy", "We tried the tea but didn't enjoy it at all", "Family is very important as it gives you values")
df <- data.frame(A, B, stringsAsFactors = FALSE)
Just another option, using the stringr::str_split_fixed function:
library(stringr)
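# Join A and B with a marker, keep only the first occurrence of each word,
# then split back into two columns at the marker
str_split_fixed(
  sapply(paste(df$A, df$B, sep = " columnbreaker "),
         function(i) {
           paste(unique(strsplit(as.character(i), split = " ")[[1]]),
                 collapse = " ")
         }),
  " columnbreaker ", 2)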
# [,1] [,2]
# [1,] "John Smith" "is a very highly smart guy"
# [2,] "Red Shirt" "We tried the tea but didn't enjoy it at all"
# [3,] "Family values are better" "is very important as it gives you"
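Because unique() runs over the combined words, the approach above also drops any word that is repeated within B itself, and within A. A base R sketch that only removes the words of B that appear in A might look like the following; drop_shared_words and result are hypothetical names used for illustration.

```r
# Rebuild the example data so the sketch is self-contained
A <- c("John Smith", "Red Shirt", "Family values are better")
B <- c("John is a very highly smart guy",
       "We tried the tea but didn't enjoy it at all",
       "Family is very important as it gives you values")

# drop_shared_words() is a hypothetical helper, not from any package
drop_shared_words <- function(a, b) {
  a_words <- strsplit(a, " ", fixed = TRUE)[[1]]
  b_words <- strsplit(b, " ", fixed = TRUE)[[1]]
  # Keep the words of b that do not occur in a; repeats within b survive
  paste(b_words[!b_words %in% a_words], collapse = " ")
}

result <- data.frame(
  A = A,
  B = unname(mapply(drop_shared_words, A, B)),
  stringsAsFactors = FALSE
)
result$B
# [1] "is a very highly smart guy"
# [2] "We tried the tea but didn't enjoy it at all"
# [3] "is very important as it gives you"
```

Matching here is on whole, space-separated tokens, so no regular expression is involved and punctuation attached to a word counts as part of that word.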