
String splitting in R Programming

Tags: split, r

Currently, the script below splits a combined item code into individual item codes.

library(stringr)  # provides str_extract()

rule2 <- c("MR")
df_1 <- test[grep(paste("^",rule2,sep="",collapse = "|"),test$Name.y),]

SpaceName_1 <- function(s){
  num <- str_extract(s,"[0-9]+")
  if(nchar(num) >3){
    former <- substring(s, 1, 4)
    latter <- strsplit(substring(s,5,nchar(s)),"")
    latter <- unlist(latter)
    return(paste(former,latter,sep = "",collapse = ","))
  }
  else{
    return (s)
  }
}

df_1$Name.y <- sapply(df_1$Name.y, SpaceName_1)

For example, the combined item code Room 324-326 is split into MR324 MR325 MR326.

However, the combined item code Room 309-311 is split into MR309 MR300 MR301.

How should I amend the script so that it gives MR309 MR310 MR311?

Nina Tan asked Sep 20 '16 09:09



2 Answers

You can try something along these lines:

range <- "324-326"
x <- as.numeric(unlist(strsplit(range, split="-")))
paste0("MR", seq(x[1], x[2]))

[1] "MR324" "MR325" "MR326"

This assumes you can already obtain the numeric room range (such as 324-326) from the item code; once you have it, the snippet above expands it into the individual codes.

If your combined item codes always have the form Room xxx-yyy, then you can extract the range using gsub:

range <- gsub("Room ", "", "Room 324-326")

If your item codes are in a vector called codes, you can obtain a vector of ranges in a single call, because gsub is vectorized:

ranges <- gsub("Room ", "", codes)
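Putting these steps together, a minimal sketch of a helper might look like the following. The function name expand_rooms and the sample vector codes are hypothetical, and the sketch assumes every code has the form Room xxx-yyy.

```r
# Hypothetical helper: expand "Room xxx-yyy" into prefixed room codes.
# Assumes every input has the form "Room <start>-<end>".
expand_rooms <- function(code, prefix = "MR") {
  rng <- gsub("Room ", "", code)                       # "Room 309-311" -> "309-311"
  x <- as.numeric(unlist(strsplit(rng, split = "-")))  # c(309, 311)
  paste0(prefix, seq(x[1], x[2]))                      # one code per room number
}

codes <- c("Room 324-326", "Room 309-311")  # sample data for illustration
lapply(codes, expand_rooms)                 # list with one character vector per code
```

Because the sequence is generated numerically rather than character by character, a range that crosses a tens boundary (309 to 311) should expand correctly.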
Tim Biegeleisen answered Sep 18 '22 05:09



We can also evaluate the string after replacing the - with : and then paste the prefix "MR".

paste0("MR", eval(parse(text=sub("\\S+\\s+(\\d+)-(\\d+)", "\\1:\\2", range))))
#[1] "MR324" "MR325" "MR326"
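To make the one-liner easier to follow, here is a sketch that breaks it into its three steps. The intermediate names expr_text, rooms, and result are only illustrative.

```r
range <- "Room 324-326"

# Step 1: rewrite "Room 324-326" as the R expression text "324:326"
expr_text <- sub("\\S+\\s+(\\d+)-(\\d+)", "\\1:\\2", range)

# Step 2: parse and evaluate that text, yielding the integer sequence 324:326
rooms <- eval(parse(text = expr_text))

# Step 3: prepend the prefix to every element of the sequence
result <- paste0("MR", rooms)
```

Note that eval(parse()) executes whatever text it receives, so this approach is best kept to input you trust.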

Wrap it in a function for convenience:

fChange <- function(prefixStr, RangeStr) {
  # Turn "Room 324-326" into the expression "324:326", evaluate it, add the prefix
  paste0(prefixStr, eval(parse(text = sub("\\S+\\s+(\\d+)-(\\d+)",
                                          "\\1:\\2", RangeStr))))
}

fChange("MR", range)
#[1] "MR324" "MR325" "MR326"
fChange("MR", range1)
#[1] "MR309" "MR310" "MR311"

For multiple elements, loop over them and apply the function:

sapply(c(range, range1), fChange, prefixStr = "MR")
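One thing to keep in mind: sapply simplifies its result to a matrix when every range expands to the same number of rooms, and returns a list otherwise. If the ranges may differ in length, a sketch using lapply gives a consistent list result. The vector ranges and its second entry are hypothetical inputs, and fChange is repeated here only so the example runs on its own.

```r
# fChange as defined in this answer, repeated so the example is self-contained
fChange <- function(prefixStr, RangeStr) {
  paste0(prefixStr, eval(parse(text = sub("\\S+\\s+(\\d+)-(\\d+)",
                                          "\\1:\\2", RangeStr))))
}

# Hypothetical inputs; the second range is longer than the first
ranges <- c("Room 324-326", "Room 101-105")

# lapply always returns a list: one character vector per input
out <- lapply(ranges, fChange, prefixStr = "MR")
names(out) <- ranges
out
```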

data

range <- "Room 324-326"
range1 <- "Room 309-311"
akrun answered Sep 21 '22 05:09
