I have a vector of character strings:
v1 <- c("Firstname LastnameFirstname Lastname",
"Firstname Lastname",
"Firstname Lastname",
"Firstname LastnameFirstname Lastname")
I'd like to split each string at the point where a lowercase letter is immediately followed by a capital letter, retaining both letters.
The desired output would be:
[1] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname Lastname"
Following examples on Stack Exchange, I tried the strsplit function together with gsub:
unlist(strsplit( gsub("([a-z][A-Z])","\\1~",v1), "~" ))
but this does not split between the two characters. Instead, the marker is inserted after the whole two-character match, so the split happens one character too late:
[1] "Firstname LastnameF" "irstname Lastname" "Firstname Lastname" "Firstname Lastname" "Firstname LastnameF" "irstname Lastname"
How do I split between the two characters while still retaining both of them?
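We can use regex lookarounds to split at the zero-width position that is preceded by a lowercase letter (positive lookbehind, (?<=[a-z])) and followed by an uppercase letter (positive lookahead, (?=[A-Z])). Lookarounds consume no characters, so both letters are retained. perl = TRUE selects the PCRE engine, which supports lookarounds: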
unlist(strsplit(v1, "(?<=[a-z])(?=[A-Z])", perl = TRUE))
#[1] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname"
#[4] "Firstname Lastname" "Firstname Lastname" "Firstname Lastname"
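As an alternative sketch, the gsub approach from the question can also be made to work by capturing the two letters in separate groups and placing the marker between them. This assumes, as the original attempt already does, that "~" never occurs in the data; the names marked and res are only illustrative.

```r
v1 <- c("Firstname LastnameFirstname Lastname",
        "Firstname Lastname",
        "Firstname Lastname",
        "Firstname LastnameFirstname Lastname")

# Capture the lowercase and the uppercase letter in separate groups and
# put the marker between them, so neither letter is lost.
marked <- gsub("([a-z])([A-Z])", "\\1~\\2", v1)

# Split on the marker; fixed = TRUE treats "~" as a literal string.
res <- unlist(strsplit(marked, "~", fixed = TRUE))
print(res)
```

This should give the same six "Firstname Lastname" strings as the lookaround version. The lookaround version is shorter and does not depend on choosing a marker character that is absent from the input.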