I am searching for the following words in .todo files:
ZshTabCompletionBackward
MacTerminalIterm
I made the following regex:
[A-Z]{1}[a-z]*[A-Z]{1}[a-z]*
However, it is not enough, since it finds only words of the following kind:
ZshTab
In pseudocode, I am trying to write the following regex:
([A-Z]{1}[a-z]*[A-Z]{1}[a-z]*){1-9}
How can I write the above regex in Perl?
I think you want something like this, written with the /x flag to add comments and insignificant whitespace:
/
\b # word boundary so you don't start in the middle of a word
( # open grouping
[A-Z] # initial uppercase
[a-z]* # any number of lowercase letters
) # end grouping
{2,} # quantifier: at least 2 instances, unbounded max
\b # word boundary
/x
If you want it without the fancy formatting, just remove the whitespace and comments:
/\b([A-Z][a-z]*){2,}\b/
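To see what this pattern picks up, here is a minimal sketch run against a made-up line (the sample text is an assumption, not taken from a real .todo file). It uses a non-capturing group, (?:...), instead of the capturing one above, because a /g match in list context would otherwise return only the last repetition of the group rather than the whole word.

```perl
use strict;
use warnings;

# Hypothetical sample line, invented for illustration.
my $line = 'Fix ZshTabCompletionBackward and MacTerminalIterm for Zsh on USB';

# Same pattern as above, but with a non-capturing group so that
# list-context /g returns each whole match.
my @words = $line =~ /\b(?:[A-Z][a-z]*){2,}\b/g;

print "$_\n" for @words;
# Prints:
#   ZshTabCompletionBackward
#   MacTerminalIterm
#   USB
```

"Fix" and "Zsh" are skipped because they contain only one uppercase letter, while "USB" is accepted because each capital on its own counts as one repetition of the group.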
As j_random_hacker points out, this is a bit simple since it will match a word that is just consecutive capital letters. His solution, which I've expanded with /x to show some detail, ensures at least one lowercase letter:
/
\b # start at word boundary
[A-Z] # start with upper
[a-zA-Z]* # followed by any alpha
(?: # non-capturing grouping for alternation precedence
[a-z][a-zA-Z]*[A-Z] # a lowercase letter, zero or more letters, then an uppercase letter
| # or
[A-Z][a-zA-Z]*[a-z] # an uppercase letter, zero or more letters, then a lowercase letter
)
[a-zA-Z]* # anything that's left
\b # end at word
/x
Again, without the formatting:
/\b[A-Z][a-zA-Z]*(?:[a-z][a-zA-Z]*[A-Z]|[A-Z][a-zA-Z]*[a-z])[a-zA-Z]*\b/
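A quick way to check the stricter pattern is to test it against a few whole words. This is only a sketch: the candidate list is invented, and the \A and \z anchors are added here so that each candidate must match in full.

```perl
use strict;
use warnings;

# The pattern from above, precompiled with qr// for reuse.
my $mixed = qr/\b[A-Z][a-zA-Z]*(?:[a-z][a-zA-Z]*[A-Z]|[A-Z][a-zA-Z]*[a-z])[a-zA-Z]*\b/;

# Hypothetical candidates, invented for illustration.
my @candidates = qw(ZshTabCompletionBackward MacTerminalIterm USB Zsh ABc);

# Keep only candidates that match from start to end.
my @accepted = grep { /\A$mixed\z/ } @candidates;

print "$_\n" for @accepted;
# Prints:
#   ZshTabCompletionBackward
#   MacTerminalIterm
#   ABc
```

"USB" is rejected because it has no lowercase letter, and "Zsh" because it has only one uppercase letter. "ABc" still passes, since the pattern asks only for a leading capital, a second capital, and one lowercase letter somewhere in the word.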
I explain all of these features in Learning Perl.
Assuming you aren't using the regex to do extraction, and just matching...
[A-Z][a-zA-Z]*
Isn't the only real requirement that it's all letters and starts with a capital letter?
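Whether that looser requirement is enough depends on what else is in the files. As a rough sketch (the sample line is invented, and the \b anchors are an addition so that matches cannot start in the middle of a word), this pattern also picks up ordinary capitalized words:

```perl
use strict;
use warnings;

# Hypothetical sample line, invented for illustration.
my $line = 'Fix ZshTabCompletionBackward for Zsh';

# The simpler pattern, wrapped in word boundaries (an addition).
my @hits = $line =~ /\b[A-Z][a-zA-Z]*\b/g;

print "$_\n" for @hits;
# Prints:
#   Fix
#   ZshTabCompletionBackward
#   Zsh
```

If single capitalized words such as "Fix" never need to be excluded, this is the simplest of the three patterns.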