
How can I match two capital letters together, that aren't preceded by special characters, using regex?

I've read a lot of really interesting material about regex recently, especially about creating your own regex boundaries.

One thing that I don't think I've seen done (I'm 100% sure it has been done, but I haven't noticed any examples) is how to exclude a regex match if it's preceded by a 'special character', such as & ! % $ #. For example:

If I use the regex (note this is from a C# string, so the backslash is escaped):

([A-Z]{2,}\\b)

It matches runs of two or more capital letters, and the \b boundary is there to make sure the capitals aren't directly preceded or followed by other letters. This is how I believe it behaves:

AA -Match

sAB -No Match

ACs -No Match

!AD -Match

AF! -Match
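
As a quick check of the list above, here is a minimal Python sketch (Python's re module rather than C#, purely for illustration). It assumes a \b on both sides of the pattern, which is what reproduces that list; with only the trailing \b as quoted, sAB would also yield AB. The names WORD_BOUNDED, TRAILING_ONLY and boundary_results are made up for this example.

```python
import re

# Assumption: a leading \b is added to the pattern quoted in the question.
WORD_BOUNDED = re.compile(r"\b[A-Z]{2,}\b")

# The pattern as quoted (single backslash once the C# string is unescaped).
TRAILING_ONLY = re.compile(r"[A-Z]{2,}\b")

SAMPLES = ["AA", "sAB", "ACs", "!AD", "AF!"]


def boundary_results(samples):
    """Map each sample to True if WORD_BOUNDED finds a match in it."""
    return {s: bool(WORD_BOUNDED.search(s)) for s in samples}


if __name__ == "__main__":
    for sample, matched in boundary_results(SAMPLES).items():
        print(f"{sample:4} -{'Match' if matched else 'No Match'}")
```

\b only separates word characters (letters, digits, underscore) from everything else, so ! counts as a boundary, which is why !AD and AF! still match.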

I would like to know how to select only runs of two or more capital letters that aren't preceded or followed by a lowercase letter, number, or special character.

I've seen people use spaces, i.e. require a space before or after the match, but that doesn't work if the match is at the beginning or end of a line.

So, the output I would look for from the example above would be:

AA -Match

sAB -No Match

ACs -No Match

!AD -No Match

AF! -No Match

Any help is appreciated.

asked Oct 19 '22 19:10 by trueCamelType


2 Answers

You just need to use a lookbehind and a lookahead:

(?<![a-z\d!@#$%^&*()])[A-Z]{2,}(?![a-z\d!@#$%^&*()])

See regex demo

The (?<![a-z\d!@#$%^&*()]) lookbehind makes sure there is no lowercase letter ([a-z]), digit (\d), or one of the special characters you defined immediately before the match. If there is one, the match fails at that position and nothing is returned.

The (?![a-z\d!@#$%^&*()]) lookahead likewise fails the match if any of the same characters immediately follows the ALLCAPS letters.
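
To see the two lookarounds at work, here is a minimal sketch in Python's re module (not C#; the lookaround syntax is the same for this pattern). It runs the pattern against the five samples from the question. STRICT is a hypothetical variant, not part of the answer above, that also lists A-Z in both lookarounds; the names LOOKAROUND, STRICT and find are assumptions made for the example.

```python
import re

# The pattern from the answer, unchanged.
LOOKAROUND = re.compile(r"(?<![a-z\d!@#$%^&*()])[A-Z]{2,}(?![a-z\d!@#$%^&*()])")

# Hypothetical variant: capitals are also excluded on both sides, so a longer
# run of capitals cannot be matched in part.
STRICT = re.compile(r"(?<![A-Za-z\d!@#$%^&*()])[A-Z]{2,}(?![A-Za-z\d!@#$%^&*()])")


def find(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for sample in ["AA", "sAB", "ACs", "!AD", "AF!"]:
        print(f"{sample:4} -{'Match' if find(LOOKAROUND, sample) else 'No Match'}")
    # Backtracking on a longer run: the original pattern settles for "AB".
    print(find(LOOKAROUND, "ABCs"), find(STRICT, "ABCs"))
```

Only AA matches among the five samples. Because the lookarounds consume no characters, this works at the start and end of a line without needing a space. The STRICT variant only matters for runs longer than two capitals, where the engine can otherwise give back a capital to satisfy the lookahead.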

For more details, see the documentation on lookahead and lookbehind zero-length assertions.

answered Oct 31 '22 15:10 by Wiktor Stribiżew


I think it's enough to just precede the pattern you have with a negation of lowercase letters and any symbols you want to exclude. My example only excludes !, but you can add to the list as appropriate. A ^ at the start of a bracket expression negates what is inside the brackets. So, for example, you can incorporate the pattern

/[^a-z!][A-Z]{2,}[^a-z!]/g
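
A minimal Python sketch of how this pattern behaves (Python's re module standing in for the JavaScript-style literal above, with the g flag left out; CONSUMING and consumed are names made up for the example). One thing it shows: a negated class still has to match a real character, so the capitals need a neighbour on each side, and those neighbours become part of the match.

```python
import re

# Same pattern as above, written as a Python raw string.
CONSUMING = re.compile(r"[^a-z!][A-Z]{2,}[^a-z!]")


def consumed(text):
    """Return the full matched text (neighbours included), or None."""
    m = CONSUMING.search(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    for sample in ["AA", " AA ", "sAB ", "!AD "]:
        print(repr(sample), "->", repr(consumed(sample)))
```

So a bare AA at the very start or end of the input is not matched, while " AA " is matched with both spaces included. If that matters, wrapping [A-Z]{2,} in a capture group would at least isolate the capitals.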

answered Oct 31 '22 15:10 by Matt Cremeens