
Split String into Columns by two character markers

All, I have searched around and can't find an answer on how to do this. I am relatively new to R and have not used regular expressions before, but basically I have some data in a field like this:

"#Route - 6 #Category - PARKING #Details - Parking issues#Result - MOVED ON #Vehicle Type - Mercedes "

I basically want to be able to split the string up into different elements, so that each category after the # has its own column.

I tried using the tidyr package and initially tried:

string %>% separate(Description, into  =  c("Route","Details","Result","License No",
                        "Vehicle Desciption"),
                sep = "\n#", remove =F, extra =  "drop")

But I realised I only wanted the data after the "-". I tried inserting a "-" in the code, but it didn't work. Does anyone know how I can split the string, ideally between the "-" and the "#"?

Many thanks

MrMonkeyBum asked Jun 18 '26 21:06


1 Answer

In one line (assuming the string is stored in str):

> gsub("^\\s+|\\s+$","",gsub(".*?[-]","",unlist(strsplit(str,"#"))))
[1] ""               "6"              "PARKING"        "Parking issues" "MOVED ON"       "Mercedes"  

Or step by step, for better understanding.

Break the string by "#":

a = unlist(strsplit(str,"#"))

Remove everything up to and including the "-":

b = gsub(".*?[-]","",a)

Remove leading and trailing spaces:

gsub("^\\s+|\\s+$","",b)
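
The steps above return only the values, with an empty first element because the string starts with "#". Since the question asks for one column per category, here is a minimal sketch in base R that keeps the text before each "-" as the column name. The helper name parse_fields and the variable desc are made up for illustration, and the sketch assumes each field has the form "Key - Value" with no "-" inside the key:

```r
# Sample input copied from the question
desc <- "#Route - 6 #Category - PARKING #Details - Parking issues#Result - MOVED ON #Vehicle Type - Mercedes "

# Hypothetical helper: turn one "#Key - Value" string into a named character vector
parse_fields <- function(x) {
  parts <- unlist(strsplit(x, "#", fixed = TRUE))
  parts <- parts[nzchar(trimws(parts))]          # drop the empty piece before the first "#"
  keys   <- trimws(sub("\\s*-.*$", "", parts))   # text before the first "-"
  values <- trimws(sub("^[^-]*-", "", parts))    # text after the first "-"
  setNames(values, keys)
}

fields <- parse_fields(desc)

# One-row data frame; check.names = FALSE keeps "Vehicle Type" as written
result <- data.frame(as.list(fields), check.names = FALSE, stringsAsFactors = FALSE)
```

For a whole column of such strings, one option would be to apply parse_fields to each element with lapply and combine the rows with do.call(rbind, ...), provided every string contains the same keys in the same order.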

Alexey Ferapontov answered Jun 21 '26 10:06


