Remove elements of a vector that are substrings of another

Uploaded: 2022-09-14 15:30:29

Question

Is there a better way to achieve this? I'd like to remove every string from this vector that is a substring of another element.

words = c("please can you", 
  "please can", 
  "can you", 
  "how did you", 
  "did you",
  "have you")
> words
[1] "please can you" "please can"     "can you"        "how did you"    "did you"        "have you"

library(data.table)
library(stringr)
dt = setDT(expand.grid(word1 = words, word2 = words, stringsAsFactors = FALSE))
dt[, found := str_detect(word1, word2)]
setdiff(words, dt[found == TRUE & word1 != word2, word2])
[1] "please can you" "how did you"    "have you"

This works, but it seems like overkill and I'm interested to know a more elegant way of doing it.

G. Grothendieck · Accepted Answer

Search for each element of words within words, keeping those that are found exactly once (that is, only in themselves):

words[colSums(sapply(words, grepl, words, fixed = TRUE)) == 1]

giving:

[1] "please can you" "how did you"    "have you"
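
To see why the one-liner works, it may help to unpack it into steps. The following is only a sketch of the same idea; the intermediate names m, hits and kept are introduced here for illustration and are not part of the original answer.

```r
words <- c("please can you", "please can", "can you",
           "how did you", "did you", "have you")

# Column j holds grepl(words[j], words, fixed = TRUE), so
# entry [i, j] is TRUE when words[j] occurs inside words[i].
# fixed = TRUE matches literal text rather than a regular expression.
m <- sapply(words, grepl, words, fixed = TRUE)

# For each word, count how many elements contain it. The count is
# always at least 1, because every word contains itself.
hits <- colSums(m)

# A count of exactly 1 means the word is found only in itself,
# so it is not a substring of any other element.
kept <- words[hits == 1]
print(kept)
```

One caveat with this counting approach: if words contained exact duplicates, each copy would be found at least twice and every copy would be dropped. Calling unique(words) first is one possible way around that.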

Tags:

string

r

Akhil Nair
