
Splitting a string based on letter case

Tags: r

I want to split the following string

"ATextIWantToDisplayWithSpaces"

like this

A Text I Want To Display With Spaces.

I tried this code in R

strsplit(x="ATextIWantToDisplayWithSpaces", split=[:upper:])

which produces this error

Error: unexpected '[' in "strsplit(x="ATextIWantToDisplayWithSpaces", split=["

Any help will be highly appreciated. Thanks

MYaseen208 asked Nov 03 '11 00:11


3 Answers

Just do this. It works by (a) locating each upper-case letter, (b) capturing it in a group, and (c) replacing it with the same letter preceded by a space.

gsub('([[:upper:]])', ' \\1', x)
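
As a minimal sketch of the replacement above, assuming `x` holds the string from the question: because the first character is itself upper case, the result starts with a space, which `trimws()` (base R 3.2.0 or later) can remove.

```r
# x is assumed to hold the string from the question
x <- "ATextIWantToDisplayWithSpaces"

# Insert a space before every upper-case letter
spaced <- gsub("([[:upper:]])", " \\1", x)

# The first letter is upper case too, so drop the leading space
result <- trimws(spaced)
print(result)
```
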
Ramnath answered Sep 30 '22 23:09


An answer to your specific question ("how do I split on uppercase letters?") is to pass the pattern as a quoted string, with [:upper:] inside a bracket expression:

strsplit(x="ATextIWantToDisplayWithSpaces", split="[[:upper:]]")

but @Ramnath's answer is what you actually want. strsplit throws away the characters on which it splits. The splitByPattern function from R.utils is closer, but it still won't return the results in the most convenient form for you.
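
To illustrate the point about discarded characters, here is a small sketch that is not part of the original answer. The second call is only an assumption about what might be more convenient: it uses a Perl-style zero-width lookahead so that the split happens before each upper-case letter without consuming it.

```r
x <- "ATextIWantToDisplayWithSpaces"

# Splitting on the letters themselves consumes them
dropped <- strsplit(x, split = "[[:upper:]]")[[1]]
print(dropped)

# A zero-width lookahead splits before each upper-case letter (except at the
# start of the string), so the letters are kept
kept <- strsplit(x, split = "(?!^)(?=[[:upper:]])", perl = TRUE)[[1]]
print(kept)

# Rejoin the pieces with spaces
joined <- paste(kept, collapse = " ")
print(joined)
```

None of the pieces from the first call contains an upper-case letter, which is why the `gsub` approach is usually the better fit when the goal is only to insert spaces.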

Ben Bolker answered Sep 30 '22 22:09


I know this is an old question, but I adapted the solution above to a case where I needed to split the values of a data frame column at upper-case letters and keep only the second element. This solution uses dplyr and purrr:

df %>% mutate(stringvar = map(strsplit(stringvar, "(?!^)(?=[[:upper:]])", perl = TRUE), ~ .x[2]) %>% unlist())
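
For readers without dplyr and purrr, the same idea can be sketched in base R. The data frame `df` and its values below are hypothetical, made up only for illustration; the column name `stringvar` is taken from the code above.

```r
# Hypothetical data, for illustration only
df <- data.frame(stringvar = c("FirstSecondThird", "AlphaBeta", "Single"),
                 stringsAsFactors = FALSE)

# Split before each upper-case letter (but not at the start of the string)
pieces <- strsplit(df$stringvar, "(?!^)(?=[[:upper:]])", perl = TRUE)

# Keep the second piece; indexing past the end of a vector yields NA
df$stringvar <- vapply(pieces, function(p) p[2], character(1))
print(df$stringvar)
```

Values with only one capitalised word have no second piece, so they come back as `NA` rather than raising an error.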
RTutt answered Sep 30 '22 21:09