I am having a problem with regex and strsplit. I would like to split the following string x at its second : symbol:
x <- "26/11/19, 22:16 - Super Mario: It's a me: Super Mario!, but also : the princess"
and obtain something like this:
"26/11/19, 22:16 - Super Mario"
" It's a me: Super Mario!, but also : the princess"
I am using strsplit with the regular expression below which, based on my limited know-how, should read as "select ONLY the colon that is followed by a space and preceded ONLY by letters".
I tried to make the regex non-greedy with the ? symbol, but clearly I am missing something: the result is not what I expect, because the split also happens at me:.
I think a non-greedy operator is essential, because this string is just an example; I will not always have the word Mario, of course.
strsplit(x, "(?<=[[:alpha:]]):(?= )", perl = TRUE)
Thank you in advance!
We can replace the first ':' that follows a letter with another character, or simply double it, and then split on the doubled colon with strsplit:
strsplit(sub("([[:alpha:]]):", "\\1::", x),
"(?<=[[:alpha:]]):{2,}(?= )", perl = TRUE)[[1]]
#[1] "26/11/19, 22:16 - Super Mario"
#[2] " It's a me: Super Mario!, but also : the princess"
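A base-R alternative that leaves the string unmodified is to locate only the first match with regexpr and cut around it with substring. This is a minimal sketch that reuses the pattern from the question; the helper name split_first is made up for illustration and is not part of base R.

```r
# Hypothetical helper: split one string at the first match of `pattern`.
split_first <- function(s, pattern) {
  m <- regexpr(pattern, s, perl = TRUE)  # first match only; -1 if there is none
  start <- as.integer(m)                 # position where the match begins
  len <- attr(m, "match.length")         # number of characters matched
  if (start == -1L) return(s)            # no match: return the string unchanged
  c(substring(s, 1, start - 1), substring(s, start + len))
}

x <- "26/11/19, 22:16 - Super Mario: It's a me: Super Mario!, but also : the princess"
split_first(x, "(?<=[[:alpha:]]):(?= )")
#[1] "26/11/19, 22:16 - Super Mario"
#[2] " It's a me: Super Mario!, but also : the princess"
```

Because regexpr reports only the first match, later colons such as the one in "me:" are never considered as split points.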
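Or with str_split from stringr, where n = 2 caps the result at two pieces, so only the first match is used as a split point: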
library(stringr)
str_split(x, "(?<=[[:alpha:]]):(?= )", n = 2)[[1]]
#[1] "26/11/19, 22:16 - Super Mario"
#[2] " It's a me: Super Mario!, but also : the princess"
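As for why the attempt in the question splits too often: strsplit cuts at every match of the pattern, and the colon in "me: " also follows a letter and precedes a space. Greediness only controls how much a quantifier consumes within a single match, so a lazy ? cannot reduce the number of matches. A quick check in base R, using the string and pattern from the question:

```r
x <- "26/11/19, 22:16 - Super Mario: It's a me: Super Mario!, but also : the princess"

# Every colon preceded by a letter and followed by a space is a split point.
pieces <- strsplit(x, "(?<=[[:alpha:]]):(?= )", perl = TRUE)[[1]]
pieces
#[1] "26/11/19, 22:16 - Super Mario"
#[2] " It's a me"
#[3] " Super Mario!, but also : the princess"
```

The colon in 22:16 follows a digit and the one in "also :" follows a space, so neither of those is a split point.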