Extract all characters to the left of a list of possible characters

Question

I have a series of strings in a dataframe like the ones below:

item_time <- c("pink dress july noon", "shirt early september morning",
               "purple dress april", "tall purple shoes february")

And I want to extract everything to the left of the first match from a list of possible strings like these:

time <- c("january", "january night", "february", "march", "april", "may",
          "may morning", "june", "july", "july noon", "august", "september",
          "early september morning", "october", "november", "december")

The result I want would look like this:

[1] pink dress
[2] shirt
[3] purple dress
[4] tall purple shoes

I can't separate them by spaces, as there is a varying number of words in both the time and item parts. I also don't have a symbol that separates them. I feel there should be a simple and elegant way of solving this, but I can't figure it out.

acylam · Accepted Answer

We can use strsplit in Base R:

sapply(strsplit(item_time, split = paste0("\\s", time, collapse = "|")), `[`, 1)
# [1] "pink dress"        "shirt"             "purple dress"      "tall purple shoes"

Notes:

I first collapse the time vector into a single pattern, separating the terms with |, then use that pattern to split item_time with strsplit. Since the split argument of strsplit accepts regular expressions, | is interpreted as an OR operator, effectively splitting item_time wherever it sees one of the terms in time. sapply(..., `[`, 1) then looks at each element of the resulting list and extracts its first element, which is the leftmost string after the split.
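The steps described in these notes can be sketched one at a time. This is only an illustrative breakdown of the one-liner above; the intermediate names pattern, pieces and items are introduced here for readability and are not part of the original answer.

```r
item_time <- c("pink dress july noon", "shirt early september morning",
               "purple dress april", "tall purple shoes february")
time <- c("january", "january night", "february", "march", "april", "may",
          "may morning", "june", "july", "july noon", "august", "september",
          "early september morning", "october", "november", "december")

# Step 1: build one alternation pattern. "\\s" is how the regex \s
# (a whitespace character) is written inside an R string.
pattern <- paste0("\\s", time, collapse = "|")

# Step 2: split each string wherever one of the alternatives matches.
# strsplit returns a list with one character vector per input string.
pieces <- strsplit(item_time, split = pattern)

# Step 3: keep only the first piece of each split, i.e. the text
# to the left of the first matched time term.
items <- sapply(pieces, `[`, 1)
print(items)
```

Because the leading whitespace is part of each alternative, the extracted items come back without a trailing space.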

KU99 · Answer

You can use sub, as it is vectorized:

sub(paste0("\\s*", time, ".*", collapse = "|"), "", item_time)
# [1] "pink dress"        "shirt"             "purple dress"      "tall purple shoes"
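To make the sub approach easier to follow, here is a small sketch that builds the pattern separately. The input "green hat" is a made-up string with no time term, added only to show that sub leaves non-matching strings unchanged; the names pattern, x and result are likewise assumptions for illustration.

```r
time <- c("january", "january night", "february", "march", "april", "may",
          "may morning", "june", "july", "july noon", "august", "september",
          "early september morning", "october", "november", "december")

# Each alternative has the form "\\s*<term>.*": optional whitespace,
# the time term, then everything up to the end of the string.
pattern <- paste0("\\s*", time, ".*", collapse = "|")

# "green hat" is a hypothetical input that contains no time term.
x <- c("pink dress july noon", "green hat")

# sub replaces the first match with "", so the time part and anything
# after it are dropped; strings with no match are returned as they are.
result <- sub(pattern, "", x)
print(result)
```

With strsplit the behavior for non-matching strings is the same (the whole string is the first piece), so the choice between the two is mostly a matter of taste.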

Tags: string, r, extract

Isa Navarro
