For example:
thisIsMySample
should be:
this_Is_My_Sample
My code:
System.Text.RegularExpressions.Regex.Replace(input, "([A-Z])", "_$0", System.Text.RegularExpressions.RegexOptions.Compiled);
It works fine, but if the input is changed to:
ThisIsMySample
the output will be:
_This_Is_My_Sample
How can the first occurrence be ignored?
Non-Regex solution
string result = string.Concat(input.Select((x,i) => i > 0 && char.IsUpper(x) ? "_" + x.ToString() : x.ToString()));
It also seems to be quite fast. Over 1,000,000 iterations: Regex: 2569ms, LINQ: 1489ms
Stopwatch stp = new Stopwatch();
stp.Start();
for (int i = 0; i < 1000000; i++)
{
    string input = "ThisIsMySample";
    string result = System.Text.RegularExpressions.Regex.Replace(input, "(?<=.)([A-Z])", "_$0",
        System.Text.RegularExpressions.RegexOptions.Compiled);
}
stp.Stop();
MessageBox.Show(stp.ElapsedMilliseconds.ToString());
// Result: 2569ms
Stopwatch stp2 = new Stopwatch();
stp2.Start();
for (int i = 0; i < 1000000; i++)
{
    string input = "ThisIsMySample";
    string result = string.Concat(input.Select((x, j) => j > 0 && char.IsUpper(x) ? "_" + x.ToString() : x.ToString()));
}
stp2.Stop();
MessageBox.Show(stp2.ElapsedMilliseconds.ToString());
// Result: 1489ms
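If the per-character string allocations in the LINQ version are a concern, the same idea can be written with a StringBuilder. This is only a sketch: the class and method names (NonRegexSketch, InsertUnderscores) are made up for illustration, and it has not been timed against the two variants above.

```csharp
using System.Text;

public static class NonRegexSketch
{
    // Hypothetical helper: prefixes every uppercase character except the
    // first character of the string with an underscore.
    public static string InsertUnderscores(string input)
    {
        var sb = new StringBuilder(input.Length + 8);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            // Skip index 0 so a leading capital gets no underscore.
            if (i > 0 && char.IsUpper(c))
            {
                sb.Append('_');
            }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this, NonRegexSketch.InsertUnderscores("ThisIsMySample") should give This_Is_My_Sample, the same as the one-liner.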
You can use a lookbehind to ensure that each match is preceded by at least one character:
System.Text.RegularExpressions.Regex.Replace(input, "(?<=.)([A-Z])", "_$0",
System.Text.RegularExpressions.RegexOptions.Compiled);
Lookaheads and lookbehinds allow you to make assertions about the text surrounding a match without including that text within the match.
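To make the "without including that text" part concrete, here is a small sketch that wraps the same pattern in a reusable Regex instance. The names LookbehindSketch and InsertUnderscores are just placeholders, not anything from the question.

```csharp
using System.Text.RegularExpressions;

public static class LookbehindSketch
{
    // (?<=.) is a zero-width lookbehind: it requires one preceding character
    // but does not consume it, so the match is only the capital letter.
    private static readonly Regex InnerCapital =
        new Regex("(?<=.)[A-Z]", RegexOptions.Compiled);

    // Hypothetical helper name, used here for illustration only.
    public static string InsertUnderscores(string input)
    {
        // $0 refers to the whole match, i.e. the capital letter itself.
        return InnerCapital.Replace(input, "_$0");
    }
}
```

The first character can never match because nothing precedes it, so ThisIsMySample becomes This_Is_My_Sample while thisIsMySample still becomes this_Is_My_Sample.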
Maybe something like this:
var str = Regex.Replace(input, "([A-Z])", "_$0", RegexOptions.Compiled);
// Strip the underscore that was added in front of a leading capital.
if (str.StartsWith("_"))
    str = str.Substring(1);
// (Preceded by a lowercase character or digit) (a capital) => The character prefixed with an underscore
var result = Regex.Replace(input, "(?<=[a-z0-9])[A-Z]", m => "_" + m.Value);
result = result.ToLowerInvariant();
This handles both PascalCase and camelCase.
Existing underscores are kept: __HiThere_Guys becomes __hi_there_guys.
Digits are treated like lowercase letters: NewVersion3 becomes new_version3.
A capital directly after a digit starts a new word: 3VersionsHere becomes 3_versions_here, but 3rdVersion becomes 3rd_version.
Capitalized two-letter acronyms (e.g. IDNumber, where ID would be considered a separate word), as suggested in Microsoft's Capitalization Conventions, are not supported, since they conflict with other cases. I recommend resisting this guideline in general, as it is a seemingly arbitrary exception to the convention of not capitalizing acronyms. Stick with IdNumber.
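For reference, the two lines above can be packaged as a single helper. This is a sketch under my own naming (SnakeCaseSketch and ToSnakeCase are not from the answer), with the behaviour described above noted in the comments.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SnakeCaseSketch
{
    // A capital letter that directly follows a lowercase letter or a digit.
    private static readonly Regex WordStart =
        new Regex("(?<=[a-z0-9])[A-Z]", RegexOptions.Compiled);

    // Hypothetical helper name, used here for illustration only.
    public static string ToSnakeCase(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        // Prefix each matched capital with an underscore, then lowercase everything.
        return WordStart.Replace(input, m => "_" + m.Value).ToLowerInvariant();
    }
}
```

Note that ToSnakeCase("IDNumber") gives idnumber, not id_number, because no capital in that input is preceded by a lowercase letter or digit.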