I have a string s in which "substrings" are separated by a pipe. A substring may or may not contain numbers. I also have a test string n that contains a number and may or may not contain letters. See the example below. Note that the spacing can be arbitrary.
I'm trying to drop every substring for which the number in n neither falls inside the substring's range nor matches its number exactly. I understand that I need to split by -, convert to numbers, and compare low/high to n converted to numeric. Here's my starting point, but I got stuck on getting the final good string out of unl_new.
s = "liquid & bar soap 1.0 - 2.0oz | bar 2- 5.0 oz | liquid soap 1-2oz | dish 1.5oz"
n = "1.5oz"
unl = unlist(strsplit(s,"\\|"))
unl_new = (strsplit(unl,"-"))
unl_new = unlist(gsub("[a-zA-Z]","",unl_new))
Desired output:
"liquid & bar soap 1.0 - 2.0oz | liquid soap 1-2oz | dish 1.5oz"
Am I completely on the wrong path? Thanks!
Here is an option using base R:
## extract the numeric part of n
nn <- as.numeric(gsub("[^0-9|. ]", "", n))
## keep only digits, dots, spaces, pipes and "-" (for intervals),
## then split by |
## for each piece, test the condition to build a logical vector
contains_n <- sapply(strsplit(gsub("[^0-9|. -]", "", s), "[|]")[[1]],
                     function(x) {
                       yy <- strsplit(x, "-")[[1]]
                       ## drop empty or whitespace-only pieces before converting
                       yy <- as.numeric(yy[nzchar(trimws(yy))])
                       ## the condition: exact match, or nn inside [low, high]
                       (length(yy) == 1 && yy == nn) ||
                         (length(yy) == 2 && nn >= yy[1] && nn <= yy[2])
                     })
## split again and use the logical vector to drop the parts
## that don't satisfy the condition, then paste the result back
## together with collapse = "|" so the pipes are restored
paste(strsplit(s, "[|]")[[1]][contains_n], collapse = "|")
## [1] "liquid & bar soap 1.0 - 2.0oz | liquid soap 1-2oz | dish 1.5oz"
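For comparison, here is a rough sketch of the same idea that pulls the numbers out of each piece with regmatches() instead of stripping characters, so a stray hyphen inside a word is not mistaken for a range separator. The function name filter_by_number and the pattern num_pattern are made up for this illustration. It assumes that n contains exactly one number and that each piece holds either a single number or a low/high pair; pieces with any other count of numbers are dropped.

```r
# Hypothetical helper (not part of the answer above): keep the pipe-separated
# pieces of `s` whose number, or number range, matches the number in `n`.
filter_by_number <- function(s, n) {
  # Assumed number format: digits with an optional decimal part
  num_pattern <- "[0-9]+\\.?[0-9]*"

  # Pull the first number out of the test string, e.g. "1.5oz" -> 1.5
  target <- as.numeric(regmatches(n, regexpr(num_pattern, n)))
  stopifnot(length(target) == 1)

  # Split on the literal pipe and trim the surrounding spaces
  parts <- trimws(strsplit(s, "|", fixed = TRUE)[[1]])

  keep <- vapply(parts, function(p) {
    nums <- as.numeric(regmatches(p, gregexpr(num_pattern, p))[[1]])
    if (length(nums) == 1) {
      nums == target                              # exact match
    } else if (length(nums) == 2) {
      target >= nums[1] && target <= nums[2]      # inside [low, high]
    } else {
      FALSE                                       # no numbers, or more than two
    }
  }, logical(1))

  paste(parts[keep], collapse = " | ")
}

s <- "liquid & bar soap 1.0 - 2.0oz | bar 2- 5.0 oz | liquid soap 1-2oz | dish 1.5oz"
n <- "1.5oz"
result <- filter_by_number(s, n)
result
## [1] "liquid & bar soap 1.0 - 2.0oz | liquid soap 1-2oz | dish 1.5oz"
```

Because the pieces are trimmed and rejoined with " | ", the spacing around the pipes is normalised rather than preserved from the input.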