Regex: convert camel case to all caps with underscores

Tags:

c#

regex

What regular expression can be used to make the following conversions?

City -> CITY
FirstName -> FIRST_NAME
DOB -> DOB
PATId -> PAT_ID
RoomNO -> ROOM_NO

The following almost works - it just adds an extra underscore to the beginning of the word:

var rgx = @"(?x)( [A-Z][a-z,0-9]+ | [A-Z]+(?![a-z]) )";

var tests = new string[] { "City",
                           "FirstName",
                           "DOB",
                           "PATId",
                           "RoomNO"};

foreach (var test in tests)
    Console.WriteLine("{0} -> {1}", test, 
                       Regex.Replace(test, rgx, "_$0").ToUpper());


// output:
// City -> _CITY
// FirstName -> _FIRST_NAME
// DOB -> _DOB
// PATId -> _PAT_ID
// RoomNO -> _ROOM_NO
MCS asked Dec 22 '10 16:12



1 Answer

Building on John M Gant's idea of adding underscores and then capitalizing, I think this regular expression should work:

([A-Z])([A-Z][a-z])|([a-z0-9])([A-Z])

replacing with:

$1$3_$2$4

You can name the capture groups to make the replacement string a little easier to read. In any given match, only one of $1 and $3 has a value, and likewise only one of $2 and $4; the other two are empty. Once the underscores are in place, convert the result to upper case. The general idea is to add an underscore when:

  • Two capital letters are followed by a lowercase letter: place the underscore between the two capital letters. (PATId -> PAT_Id)
  • A lowercase letter or digit is followed by a capital letter: place the underscore between the two. (RoomNO -> Room_NO and FirstName -> First_Name)
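
As a rough sketch of how the two rules above could be wired together in C#, the snippet below applies the pattern and replacement string from this answer and then upper-cases the result. The class and method names (CaseConverter, ToUpperSnake, ToUpperSnakeNamed) and the group names are made up for illustration, and ToUpperInvariant is my substitution for the question's ToUpper to avoid culture-specific casing; treat it as one possible arrangement rather than the only one.

```csharp
using System;
using System.Text.RegularExpressions;

static class CaseConverter
{
    // Pattern from the answer: an underscore goes between groups 1 and 2,
    // or between groups 3 and 4, depending on which alternative matched.
    private static readonly Regex Boundary =
        new Regex(@"([A-Z])([A-Z][a-z])|([a-z0-9])([A-Z])");

    // The same pattern with named groups (names are hypothetical).
    private static readonly Regex NamedBoundary = new Regex(
        @"(?<upper>[A-Z])(?<word>[A-Z][a-z])|(?<lower>[a-z0-9])(?<next>[A-Z])");

    // Groups that did not participate in the match expand to an empty string.
    public static string ToUpperSnake(string input) =>
        Boundary.Replace(input, "$1$3_$2$4").ToUpperInvariant();

    public static string ToUpperSnakeNamed(string input) =>
        NamedBoundary.Replace(input, "${upper}${lower}_${word}${next}")
                     .ToUpperInvariant();
}

static class Program
{
    static void Main()
    {
        var tests = new[] { "City", "FirstName", "DOB", "PATId", "RoomNO" };

        foreach (var test in tests)
            Console.WriteLine("{0} -> {1}", test, CaseConverter.ToUpperSnake(test));

        // output:
        // City -> CITY
        // FirstName -> FIRST_NAME
        // DOB -> DOB
        // PATId -> PAT_ID
        // RoomNO -> ROOM_NO
    }
}
```

Because the pattern only matches at a boundary between two characters, nothing is inserted at the start of the string, which is what avoids the leading underscore from the question's attempt.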

Hope this helps.

John McDonald answered Sep 24 '22 04:09
