
Regex for PascalCased words (aka camelCased with leading uppercase letter)

How do I find all PascalCased words in a document with a regular expression?

In case the term PascalCase is unfamiliar: I mean upper camel case (i.e., camel-cased words in which the first letter is capitalized).

Tom Lehman asked Jul 14 '09 21:07



2 Answers

([A-Z][a-z0-9]+)+ 

This assumes English; use the appropriate character classes if you need it to work for other languages. Note that it also matches single capitalized words such as "This". If you only want to match words with at least two capitals, use

([A-Z][a-z0-9]+){2,} 
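To pull such words out of a document, something like the following Python sketch could work. The `\b` word boundaries and the non-capturing group are additions made here (so that `findall` returns whole words), and `find_pascal_words` is a hypothetical helper name, not part of the answer.

```python
import re

# The {2,} pattern from above, wrapped in \b word boundaries (an assumption
# added here so that matches line up with whole words) and written with a
# non-capturing group so findall returns the full match.
PASCAL_WORD = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b")

def find_pascal_words(text):
    """Return every word made of two or more Capitalized chunks."""
    return PASCAL_WORD.findall(text)

sample = "This line mentions XmlHttpRequest, newCustomerId and YouTubeImporter."
print(find_pascal_words(sample))
# ['XmlHttpRequest', 'YouTubeImporter']
```

A word such as "HTTPServer" is not found by this pattern, because every capital must be followed by at least one lowercase letter or digit.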

UPDATE: As I mentioned in a comment, a better version is:

[A-Z]([A-Z0-9]*[a-z][a-z0-9]*[A-Z]|[a-z0-9]*[A-Z][A-Z0-9]*[a-z])[A-Za-z0-9]* 

It matches strings that start with an uppercase letter, contain only letters and numbers, and contain at least one lowercase letter and at least one other uppercase letter.
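As a quick check of that description, here is a small Python sketch that applies the pattern to whole words with `fullmatch`. The helper name `is_pascal_cased` and the sample words are made up for illustration.

```python
import re

# The "better version" from above, copied unchanged.
IMPROVED = re.compile(
    r"[A-Z]([A-Z0-9]*[a-z][a-z0-9]*[A-Z]|[a-z0-9]*[A-Z][A-Z0-9]*[a-z])[A-Za-z0-9]*"
)

def is_pascal_cased(word):
    """True if the whole word matches the improved pattern."""
    return IMPROVED.fullmatch(word) is not None

for word in ["XmlHttpRequest", "HTTPServer", "This", "HTTP", "camelCase"]:
    print(word, is_pascal_cased(word))
# XmlHttpRequest True
# HTTPServer True
# This False
# HTTP False
# camelCase False
```

"This" fails because it has no second uppercase letter, "HTTP" fails because it has no lowercase letter, and "camelCase" fails because it does not start with an uppercase letter.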

Adam Crume answered Oct 06 '22 10:10


Lower camel case

This regex allows digits and validates strict lower camel case as defined by the Google Java Style Guide.

[a-z]+((\d)|([A-Z0-9][a-z0-9]+))*([A-Z])? 
  1. The first character is a lowercase letter.
  2. Each following element is either a single digit, or an uppercase letter (or digit) followed by one or more lowercase letters or digits.
  3. The last character may be a single uppercase letter.

The following identifiers all match this regex.

xmlHttpRequest newCustomerId innerStopwatch supportsIpv6OnIos youTubeImporter youtubeImporter affine3D 
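A minimal Python sketch for validating identifiers against this pattern, using `fullmatch` so the whole name has to conform. The function name `is_lower_camel` and the rejected samples are hypothetical, chosen only for illustration.

```python
import re

# Lower camel case pattern from above, copied unchanged.
LOWER_CAMEL = re.compile(r"[a-z]+((\d)|([A-Z0-9][a-z0-9]+))*([A-Z])?")

def is_lower_camel(name):
    """True if the whole identifier is strict lower camel case."""
    return LOWER_CAMEL.fullmatch(name) is not None

valid = ["xmlHttpRequest", "newCustomerId", "innerStopwatch",
         "supportsIpv6OnIos", "youTubeImporter", "youtubeImporter", "affine3D"]
# Illustrative counter-examples: leading capital, run of capitals, underscore.
rejected = ["XmlHttpRequest", "xmlHTTPRequest", "new_customer_id"]

print(all(is_lower_camel(n) for n in valid))      # True
print(any(is_lower_camel(n) for n in rejected))   # False
```

Note that "xmlHTTPRequest" is rejected: the strict form allows an uppercase letter only at the start of a chunk or as the very last character.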

Upper camel case

The same principle as for lower camel case applies, except that the name always starts with an uppercase character.

([A-Z][a-z0-9]+)((\d)|([A-Z0-9][a-z0-9]+))*([A-Z])? 

The following identifiers all match this regex.

XmlHttpRequest NewCustomerId InnerStopwatch SupportsIpv6OnIos YouTubeImporter YoutubeImporter Affine3D 
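To illustrate how the two patterns relate, this Python sketch checks the upper camel case examples and then lowercases their first letter to see whether the results pass the lower camel case pattern. `to_lower_camel` is a hypothetical helper introduced here, and the check covers only the listed examples; it is not a claim that the two patterns are equivalent in general.

```python
import re

# Both patterns copied unchanged from the answer.
LOWER_CAMEL = re.compile(r"[a-z]+((\d)|([A-Z0-9][a-z0-9]+))*([A-Z])?")
UPPER_CAMEL = re.compile(r"([A-Z][a-z0-9]+)((\d)|([A-Z0-9][a-z0-9]+))*([A-Z])?")

def is_upper_camel(name):
    """True if the whole identifier is strict upper camel case."""
    return UPPER_CAMEL.fullmatch(name) is not None

def to_lower_camel(name):
    """Hypothetical helper: lowercase only the first character."""
    return name[0].lower() + name[1:]

examples = ["XmlHttpRequest", "NewCustomerId", "InnerStopwatch",
            "SupportsIpv6OnIos", "YouTubeImporter", "YoutubeImporter", "Affine3D"]

print(all(is_upper_camel(n) for n in examples))                              # True
print(all(LOWER_CAMEL.fullmatch(to_lower_camel(n)) for n in examples))       # True
print(is_upper_camel("xmlHttpRequest"))                                      # False
```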

Nicolas Henneaux answered Oct 06 '22 10:10