Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Split a PascalCase string into separate words

Tags:

.net

regex

I am looking for a way to split PascalCase strings, e.g. "MyString", into separate words - "My", "String". Another user posed the question for bash, but I want to know how to do it with general regular expressions or at least in .NET.

Bonus if you can find a way to also split (and optionally capitalize) camelCase strings: e.g. "myString" becomes "my" and "String", with the option to capitalize/lowercase either or both of the strings.

like image 713
Pat Avatar asked Jul 09 '10 19:07

Pat


2 Answers

See this question: Is there a elegant way to parse a word and add spaces before capital letters? Its accepted answer covers what you want, including numbers and several uppercase letters in a row. While this sample has words starting in uppercase, it it equally valid when the first word is in lowercase.

string[] tests = {
   "AutomaticTrackingSystem",
   "XMLEditor",
   "AnXMLAndXSLT2.0Tool",
};


Regex r = new Regex(
    @"(?<=[A-Z])(?=[A-Z][a-z])|(?<=[^A-Z])(?=[A-Z])|(?<=[A-Za-z])(?=[^A-Za-z])"
  );

foreach (string s in tests)
  r.Replace(s, " ");

The above will output:

[Automatic][Tracking][System]
[XML][Editor]
[An][XML][And][XSLT][2.0][Tool]
like image 117
chilltemp Avatar answered Oct 07 '22 00:10

chilltemp


string.Concat(str.Select(x => Char.IsUpper(x) ? " " + x : x.ToString())).TrimStart(' ').Dump();

This is far better approach then using Regex, Dump is just to print to console

like image 22
Sooraj kumar Avatar answered Oct 06 '22 23:10

Sooraj kumar