
Extract all characters to the left of a list of possible characters

Tags: string, r, extract

I have a series of strings in a dataframe like the ones below:

item_time <- c("pink dress july noon", "shirt early september morning",
               "purple dress april", "tall purple shoes february")

I want to extract everything to the left of whichever of these possible time strings appears:

time <- c("january", "january night", "february", "march", "april", "may",
          "may morning", "june", "july", "july noon", "august", "september",
          "early september morning", "october", "november", "december")

The result I want would look like this:

[1] pink dress
[2] shirt
[3] purple dress
[4] tall purple shoes

I can't split on spaces, because both the item and the time contain a varying number of words, and there is no symbol separating them. I feel there should be a fairly simple and elegant way of solving this, but I can't figure it out.

Isa Navarro asked Nov 23 '25 05:11


2 Answers

We can use strsplit in base R:

sapply(strsplit(item_time, split=paste0("\\s", time, collapse="|")), `[`, 1)
# [1] "pink dress"        "shirt"             "purple dress"      "tall purple shoes"

Notes:

I first collapse the time vector into a single pattern, prefixing each term with \\s (so the preceding whitespace is consumed) and separating the terms with |, then use that pattern to split item_time with strsplit. Because the split argument of strsplit accepts regular expressions, | is interpreted as an OR operator, so item_time is split wherever one of the terms in time occurs. sapply(..., `[`, 1) then goes through each element of the resulting list and extracts its first element, which is the leftmost string after the split.
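To make the notes above concrete, here is the same one-liner unrolled into its three steps. This is only a sketch: the intermediate names pattern, pieces and items are introduced here for illustration and are not part of the original answer.

```r
item_time <- c("pink dress july noon", "shirt early september morning",
               "purple dress april", "tall purple shoes february")
time <- c("january", "january night", "february", "march", "april", "may",
          "may morning", "june", "july", "july noon", "august", "september",
          "early september morning", "october", "november", "december")

# Step 1: build one alternation pattern; every term is preceded by \\s
# so the whitespace in front of the time string is consumed by the split
pattern <- paste0("\\s", time, collapse = "|")

# Step 2: split each string wherever any of the alternatives matches;
# the result is a list with one character vector per input string
pieces <- strsplit(item_time, split = pattern)

# Step 3: keep only the first piece of every split (the item part)
items <- sapply(pieces, `[`, 1)
items
# [1] "pink dress"        "shirt"             "purple dress"      "tall purple shoes"
```

Inspecting pieces before step 3 is a quick way to check that the split happens where you expect on your own data.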

acylam answered Nov 25 '25 19:11


You can use sub, since it is vectorized over its input:

sub(paste0("\\s*",time,".*",collapse="|"),"",item_time)
# [1] "pink dress"        "shirt"             "purple dress"      "tall purple shoes"
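One thing to watch with this pattern: \\s* also allows zero whitespace, so a short term such as "may" can match in the middle of a longer word. If that matters for your data, a slightly stricter variant might look like the sketch below. It assumes the terms in time contain no regex metacharacters (true for the month names here), and strip_time and time_sorted are hypothetical names introduced only for this example. It also uses perl = TRUE, which the original answer does not.

```r
item_time <- c("pink dress july noon", "shirt early september morning",
               "purple dress april", "tall purple shoes february")
time <- c("january", "january night", "february", "march", "april", "may",
          "may morning", "june", "july", "july noon", "august", "september",
          "early september morning", "october", "november", "december")

# Put longer terms first so that, at a given position, "july noon" is
# tried before "july" (PCRE takes the first alternative that matches)
time_sorted <- time[order(nchar(time), decreasing = TRUE)]

# \\b requires a word boundary on both sides of the term, so "may"
# cannot match inside a longer word such as "mayonnaise"
pattern <- paste0("\\s*\\b(", paste(time_sorted, collapse = "|"), ")\\b.*$")

# Hypothetical helper: drop the time term and everything after it
strip_time <- function(x) sub(pattern, "", x, perl = TRUE)

strip_time(item_time)
# [1] "pink dress"        "shirt"             "purple dress"      "tall purple shoes"

strip_time("mayonnaise jar may morning")
# [1] "mayonnaise jar"
```

For the four strings in the question both versions give the same result; the stricter pattern only makes a difference when an item name happens to contain one of the time terms as a substring.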
KU99 answered Nov 25 '25 18:11


