Regex for non-consecutive uppercase with lowercase

Tags:

c++

regex

#include <regex>
#include <string>

std::string s("AAA");
std::smatch m;
std::regex e("(?=.{3,}[A-Z])(?=.{0,}[a-z]).*");
bool output = std::regex_search(s, m, e);  // false for "AAA"

Here it should have 3 or more uppercase letters and zero or more lowercase letters. However, the output is zero, meaning the match fails. When I replace the zero with 1 and use s("AAA4"), it works fine.

So now I want it to allow zero or more, but it seems it does not accept zero. I even tried the * quantifier, which is equivalent to {0,}, and it still does not work.

Here is an example:

string1 "AAA"
string2 "AAAb"
string3 "AbAA"

The following regex works with string1 and string2, as the uppercase letters are consecutive:

[A-Z]{3,}[a-z]*

The following regex works with string2 and string3, but it does not work when there are no lowercase letters, even though I specified 0:

(?=.{3,}[A-Z])(?=.{0,}[a-z]).*

What I am looking for is a regex that works with all of them and covers the following cases:

  • Allow 0 or more occurrences of lowercase letters
  • Validate that the string has 3 uppercase letters, but they don't have to be consecutive (as in string3)
MHxy asked Jan 31 '26 23:01


1 Answer

Here it should have 3 or more uppercase letters and zero or more lowercase letters.

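Your (?=.{3,}[A-Z])(?=.{0,}[a-z]).* regex matches a part of a line consisting of 0+ chars (.*) that starts with any 3 or more chars followed by an uppercase ASCII letter ((?=.{3,}[A-Z])), and that has at least one lowercase ASCII letter ((?=.{0,}[a-z])). The {0,} quantifier applies only to the preceding ., not to [a-z], so a lowercase letter is still required. "AAA" fails both checks: it has no fourth character for [A-Z] to match after .{3,}, and it contains no lowercase letter.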
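To see this in isolation, each lookahead can be tested on its own. The following is a minimal sketch, assuming the default ECMAScript grammar of std::regex; the helper name lookahead_holds is hypothetical and only serves the illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if the given lookahead succeeds at
// some position of `s`. A pattern made only of a lookahead matches the
// empty string wherever the assertion holds.
bool lookahead_holds(const std::string& s, const std::string& lookahead) {
    const std::regex e(lookahead);  // ECMAScript grammar by default
    return std::regex_search(s, e);
}
```

With this helper, both (?=.{3,}[A-Z]) and (?=.{0,}[a-z]) fail for "AAA" and both succeed for "AbAA", which is why the combined pattern rejects "AAA".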

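To match a string that contains x uppercase ASCII letters and y lowercase ASCII letters (replace x and y with the count or range you need, e.g. 3, for "3 or more" and 0, for "zero or more"), you need to use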

std::regex e("^(?=(?:[^A-Z]*[A-Z]){x}[^A-Z]*$)(?=(?:[^a-z]*[a-z]){y}[^a-z]*$)");
                                   ^                              ^

See the regex demo

Details:

  • ^ - start of a string
  • (?=(?:[^A-Z]*[A-Z]){x}[^A-Z]*$) - Positive lookahead 1 that is triggered at the start of the string and checks if there are x sequences of
    • [^A-Z]* - zero or more chars other than uppercase ASCII letters
    • [A-Z] - an uppercase ASCII letter
    • [^A-Z]*$ - zero or more chars other than uppercase ASCII letters up to the end of the string ($)
  • (?=(?:[^a-z]*[a-z]){y}[^a-z]*$) - Positive lookahead 2 that is also triggered at the start of the string (as lookaheads are zero-width assertions) and checks if there are y sequences of
    • [^a-z]* - zero or more chars other than lowercase ASCII letters
    • [a-z] - a lowercase ASCII letter
    • [^a-z]*$ - zero or more chars other than lowercase ASCII letters up to the end of the string.
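For the case in the question (3 or more uppercase letters, zero or more lowercase letters), the pattern above might be filled in as follows. This is a sketch under the assumption that x becomes 3, and y becomes 0, and the function name has_three_uppercase is hypothetical. With y set to 0, the second lookahead always succeeds, so it is kept only to mirror the general form.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when `s` contains at least 3 uppercase ASCII
// letters (not necessarily consecutive) and any number of lowercase ones.
bool has_three_uppercase(const std::string& s) {
    // x is written as "3," and y as "0," (assumed values for this case).
    static const std::regex e(
        "^(?=(?:[^A-Z]*[A-Z]){3,}[^A-Z]*$)"
        "(?=(?:[^a-z]*[a-z]){0,}[^a-z]*$)");
    // The pattern consists only of assertions, so use regex_search;
    // regex_match would need a trailing .* to consume the input.
    return std::regex_search(s, e);
}
```

Under these assumptions, has_three_uppercase returns true for "AAA", "AAAb" and "AbAA", and false for a string such as "AAb" that has only two uppercase letters.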
Wiktor Stribiżew answered Feb 02 '26 14:02



