Ignore existing spaces in converting CamelCase to string with spaces

Tags: c#, .net, regex

I want to split camelCase or PascalCase words into a space-separated collection of words.

So far, I have:

Regex.Replace(value, @"(\B[A-Z]+?(?=[A-Z][^A-Z])|\B[A-Z]+?(?=[^A-Z]))", " $0", RegexOptions.Compiled);

It works fine for converting "TestWord" to "Test Word" and for leaving single words untouched, e.g. Testing remains Testing.

However, ABCTest gets converted to A B C Test when I would prefer ABC Test.


asked Jun 05 '15 08:06

Ciaran Martin

2 Answers

Try:

[A-Z][a-z]+|[A-Z]+(?=[A-Z][a-z])|[a-z]+|[A-Z]+

An example on Regex101

How is it used in C#?

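using System.Linq;
using System.Text.RegularExpressions;

string strText = " TestWord asdfDasdf  ABCDef";

// Collect every word-like run of letters; the spaces between them are not matched.
string[] matches = Regex.Matches(strText, @"[A-Z][a-z]+|[A-Z]+(?=[A-Z][a-z])|[a-z]+|[A-Z]+")
                .Cast<Match>()
                .Select(m => m.Value)
                .ToArray();

// Rejoin the words with a single space between each pair.
string result = String.Join(" ", matches);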

result = 'Test Word asdf Dasdf ABC Def'
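
The same approach can be wrapped in a reusable method. This is a minimal sketch, assuming the pattern above; the class name `CamelCaseSplitter` and the method name `SplitWords` are hypothetical, chosen only for illustration. Because the result is built solely from the matches, any character outside A-Z and a-z (digits, punctuation, existing whitespace) is dropped from the output.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CamelCaseSplitter
{
    // Same pattern as in the answer above; compiled once and reused.
    private static readonly Regex WordPattern = new Regex(
        @"[A-Z][a-z]+|[A-Z]+(?=[A-Z][a-z])|[a-z]+|[A-Z]+",
        RegexOptions.Compiled);

    // Hypothetical helper: returns the matched words joined by single spaces.
    public static string SplitWords(string value)
    {
        var words = WordPattern.Matches(value)
            .Cast<Match>()
            .Select(m => m.Value);
        return string.Join(" ", words);
    }

    public static void Main()
    {
        Console.WriteLine(SplitWords("TestWord")); // Test Word
        Console.WriteLine(SplitWords("Testing"));  // Testing
        Console.WriteLine(SplitWords("ABCTest"));  // ABC Test
    }
}
```

The three inputs are the cases from the question: a two-word name, a single word that must stay untouched, and a leading run of capitals that should stay together.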

How it works

In the example string:

TestWord qwerDasdf
ABCTest Testing    ((*&^%$CamelCase!"£$%^^))
asdfAasdf
AaBbbCD

[A-Z][a-z]+ matches:

[0-4] Test
[4-8] Word
[13-18] Dasdf
[22-26] Test
[27-34] Testing
[45-50] Camel
[50-54] Case
[68-73] Aasdf
[74-76] Aa
[76-79] Bbb

[A-Z]+(?=[A-Z][a-z]) matches:

[19-22] ABC

[a-z]+ matches:

[9-13] qwer
[64-68] asdf

[A-Z]+ matches:

[79-81] CD
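
To reproduce these index ranges outside Regex101, something along the following lines should work. It is a sketch that assumes the four lines of the example string are joined with single `\n` characters; the class name `MatchOffsets` and the method name `Describe` are hypothetical. Each range is printed as [start-end] with an exclusive end, as in the listing above, but the matches appear in the order they occur in the string rather than grouped by alternative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class MatchOffsets
{
    public const string Pattern = @"[A-Z][a-z]+|[A-Z]+(?=[A-Z][a-z])|[a-z]+|[A-Z]+";

    // The example string from the answer, assumed to use "\n" line breaks.
    public const string Sample =
        "TestWord qwerDasdf\n" +
        "ABCTest Testing    ((*&^%$CamelCase!\"£$%^^))\n" +
        "asdfAasdf\n" +
        "AaBbbCD";

    // Formats every match as "[start-end] value", with an exclusive end index.
    public static List<string> Describe(string input)
    {
        return Regex.Matches(input, Pattern)
            .Cast<Match>()
            .Select(m => $"[{m.Index}-{m.Index + m.Length}] {m.Value}")
            .ToList();
    }

    public static void Main()
    {
        foreach (var line in Describe(Sample))
            Console.WriteLine(line);
    }
}
```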


answered Nov 06 '22 00:11

thodic

Here is my attempt:

(?<!^|\b|\p{Lu})\p{Lu}+(?=\p{Ll}|\b)|(?<!^\p{Lu}*|\b)\p{Lu}(?=\p{Ll}|(?<!\p{Lu}*)\b)

This regex can be used with Regex.Replace and " $0" (a space followed by $0, the whole match) as the replacement string.

Regex.Replace(value, @"(?<!^|\b|\p{Lu})\p{Lu}+(?=\p{Ll}|\b)|(?<!^\p{Lu}*|\b)\p{Lu}(?=\p{Ll}|(?<!\p{Lu}*)\b)", " $0", RegexOptions.Compiled);

See demo

Regex Explanation:

The pattern contains two alternatives, to account for a run of capital letters before or after lowercase letters:
(?<!^|\b|\p{Lu})\p{Lu}+(?=\p{Ll}|\b) - the first alternative matches one or more uppercase letters that are not preceded by the start of the string, a word boundary, or another uppercase letter, and that are followed by a lowercase letter or a word boundary.
(?<!^\p{Lu}*|\b)\p{Lu}(?=\p{Ll}|(?<!\p{Lu}*)\b) - the second alternative matches a single uppercase letter that is not preceded by the start of the string followed by optional uppercase letters, or by a word boundary, and that is followed by a lowercase letter or by a word boundary that is not preceded by optional uppercase letters.
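
A minimal harness for trying this pattern with Regex.Replace, assuming .NET: the pattern relies on variable-length lookbehind such as `(?<!^\p{Lu}*|\b)`, which many other regex engines reject. The class name `LookaroundSplitter`, the method name `InsertSpaces`, and the sample inputs are hypothetical. The sketch prints whatever the pattern produces instead of asserting a result, so it is best used to check the inputs you actually care about.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundSplitter
{
    // Pattern copied unchanged from the answer above.
    private static readonly Regex Boundary = new Regex(
        @"(?<!^|\b|\p{Lu})\p{Lu}+(?=\p{Ll}|\b)|(?<!^\p{Lu}*|\b)\p{Lu}(?=\p{Ll}|(?<!\p{Lu}*)\b)",
        RegexOptions.Compiled);

    // Hypothetical helper: prefixes every match with a space (" $0").
    public static string InsertSpaces(string value)
    {
        return Boundary.Replace(value, " $0");
    }

    public static void Main()
    {
        // Sample inputs are illustrative; swap in your own.
        foreach (var input in new[] { "TestWord", "Testing", "camelCaseWord" })
            Console.WriteLine($"{input} -> {InsertSpaces(input)}");
    }
}
```

Unlike the match-and-join approach, Regex.Replace leaves every existing character in place and only inserts spaces, so punctuation, digits, and existing whitespace survive.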

answered Nov 06 '22 00:11

Wiktor Stribiżew
