All, I have searched around and can't find an answer on how to do this. I am relatively new to R and have not used regular expressions before, but basically I have some data in a field like this:
"#Route - 6 #Category - PARKING #Details - Parking issues#Result - MOVED ON #Vehicle Type - Mercedes "
I want to split the string into separate elements, so that each category after a "#" has its own column.
I tried using the tidyr package and initially tried:
string %>% separate(Description,
                    into = c("Route", "Details", "Result", "License No",
                             "Vehicle Description"),
                    sep = "\n#", remove = FALSE, extra = "drop")
But I realised I only wanted the data after the "-". I tried inserting a "-" in the code, but it didn't work. Does anyone know how I can split the string, ideally keeping only what lies between each "-" and the next "#"?
Many thanks
In one line (with the text stored in a character variable called str):
> gsub("^\\s+|\\s+$","",gsub(".*?[-]","",unlist(strsplit(str,"#"))))
[1] "" "6" "PARKING" "Parking issues" "MOVED ON" "Mercedes"
Or, step by step for better understanding. Split the string on "#":
a = unlist(strsplit(str,"#"))
Remove everything up to and including the "-":
b = gsub(".*?[-]","",a)
Remove leading and trailing spaces:
gsub("^\\s+|\\s+$","",b)
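Since the question asks for one column per "#" label, the steps above can be extended to keep the labels as well. The following is only a sketch under a few assumptions: the variable names desc, parts, keys, values and result are made up for illustration, every piece is assumed to look like "label - value", and sub() is used instead of gsub() because gsub() replaces every match and would therefore also cut into a value that itself contains a "-".

```r
# Hypothetical input: the example string from the question.
desc <- "#Route - 6 #Category - PARKING #Details - Parking issues#Result - MOVED ON #Vehicle Type - Mercedes "

# Split on a literal "#" and drop the empty piece before the first "#".
parts <- unlist(strsplit(desc, "#", fixed = TRUE))
parts <- parts[nzchar(trimws(parts))]

# sub() replaces only the first match, so only the first "-" acts as the separator.
keys   <- trimws(sub("-.*$", "", parts))     # text before the first "-"
values <- trimws(sub("^[^-]*-", "", parts))  # text after the first "-"

# One row, one column per label; check.names = FALSE keeps "Vehicle Type" as is.
result <- data.frame(as.list(setNames(values, keys)),
                     check.names = FALSE, stringsAsFactors = FALSE)
print(result)
```

For a whole column of such strings, the same steps could be wrapped in a function and applied to each row, provided every row carries the same labels.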