
How to match any uppercase letter followed by the corresponding lower case letter?

Tags: java, regex

I have a requirement that says a name must not start with 3 identical letters ignoring their case. A name starts with an upper case letter followed by lower case letters.

I could simply convert the whole name to upper case and then match it against a regex like (\p{Lu})\1{2}.*, where the captured letter plus two repetitions makes three identical letters.
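For reference, a minimal sketch of that preprocessing approach. It assumes "three identical letters" means the first letter plus two repeats, so the back-reference is quantified with {2}, and it anchors the pattern at the start of the name. The class and method names are made up for illustration.

```java
import java.util.Locale;
import java.util.regex.Pattern;

class UpperCasePrecheck {
    // One uppercase letter at the start, followed by two more copies of it.
    private static final Pattern TRIPLE = Pattern.compile("^(\\p{Lu})\\1{2}");

    // Hypothetical helper: true when the name starts with three identical
    // letters once case has been removed by upper-casing the whole string.
    static boolean startsWithThreeIdentical(String name) {
        return TRIPLE.matcher(name.toUpperCase(Locale.ROOT)).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] { "Aaa", "Aab", "Aaab", "Aa", "A" }) {
            System.out.println(s + ": " + startsWithThreeIdentical(s));
        }
    }
}
```

With this sketch, "Aaa" and "Aaab" are flagged, while "Aab", "Aa" and "A" are not.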

However, I was wondering whether there is a regex that meets the above requirement without any preprocessing of the string to be matched. What regex can I use to match strings like Aa, Dd or Uu without explicitly specifying every possible combination?

EDIT:
I accepted Marko's answer. I just needed to fix it to work with names of length one and two, and to anchor it at the beginning. So the actual regex for my use case is ^(\p{Lu})(\p{Ll}?$|(?=\p{Ll}{2})(?i)(?!(\1){2})).
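As a rough illustration of how this anchored regex behaves, here is a small sketch. The pattern is the one quoted above with only Java string escaping added. The class name, the helper name and the sample names are assumptions for the demo. Note that a match here means the start of the name is acceptable; the regex does not check the rest of the name.

```java
import java.util.regex.Pattern;

class NamePrefixCheck {
    // First alternative: names of length one or two (uppercase, optional lowercase, end).
    // Second alternative: two lowercase letters follow and are not both a
    // case-insensitive repeat of the first letter.
    private static final Pattern VALID_START = Pattern.compile(
            "^(\\p{Lu})(\\p{Ll}?$|(?=\\p{Ll}{2})(?i)(?!(\\1){2}))");

    // Hypothetical helper: true when the start of the name is acceptable.
    static boolean hasValidStart(String name) {
        return VALID_START.matcher(name).find();
    }

    public static void main(String[] args) {
        String[] names = { "A", "Aa", "Aab", "Aba", "Aaa", "Aaab", "ABa", "aba" };
        for (String s : names) {
            System.out.println(s + ": " + hasValidStart(s));
        }
    }
}
```

In this sketch "A", "Aa", "Aab" and "Aba" are accepted, while "Aaa", "Aaab", "ABa" and "aba" are rejected.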

I also upvoted the answers of Evgeniy and sp00m for helping me to learn a lesson in regexes.

Thanks for your efforts.

SpaceTrucker asked Apr 24 '13 08:04


2 Answers

I admit to standing on the shoulders of giants (the other posters here), but this solution actually works for your use case:

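// Requires: import java.util.regex.Pattern;
final String[] strings = { "Aba", "ABa", "aba", "aBa", "Aaa", "Aab" };
// One uppercase letter, then two lowercase letters that are not both a repeat of it.
final Pattern p = Pattern.compile("(\\p{Lu})(?=\\p{Ll}{2})(?i)(?!(\\1){2})");
for (String s : strings) System.out.println(s + ": " + p.matcher(s).find());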

Now we have:

  1. a match for one uppercase character at the front;
  2. a lookahead asserting that two lowercase characters follow;
  3. a negative lookahead asserting that these two characters are not both the same (ignoring case) as the first one.

Output:

Aba: true
ABa: false
aba: false
aBa: false
Aaa: false
Aab: true
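The three steps can also be cross-checked without a regex. The sketch below is only an approximation under stated assumptions: it tests a match at index 0 (which is the only place the pattern can match in the three-character samples above), it uses Character.isUpperCase and Character.isLowerCase in place of \p{Lu} and \p{Ll}, and it assumes ASCII letters. The class and method names are hypothetical.

```java
class PrefixCheckByHand {
    // Hypothetical helper mirroring the three steps for a match starting at index 0.
    static boolean matchesAtStart(String s) {
        if (s.length() < 3) {
            return false; // the lookahead needs two characters after the first one
        }
        char first = s.charAt(0);
        char second = s.charAt(1);
        char third = s.charAt(2);
        // Step 1: one uppercase character at the front.
        if (!Character.isUpperCase(first)) {
            return false;
        }
        // Step 2: the next two characters are lowercase.
        if (!Character.isLowerCase(second) || !Character.isLowerCase(third)) {
            return false;
        }
        // Step 3: the two followers must not both equal the first one, ignoring case.
        char lower = Character.toLowerCase(first);
        return !(second == lower && third == lower);
    }

    public static void main(String[] args) {
        for (String s : new String[] { "Aba", "ABa", "aba", "aBa", "Aaa", "Aab" }) {
            System.out.println(s + ": " + matchesAtStart(s));
        }
    }
}
```

For the six sample strings this gives the same true/false values as the regex output listed above.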
Marko Topolnik answered Sep 24 '22 23:09


Try:

    String regex = "(?i)(.)(?=\\p{javaLowerCase})(?<=\\p{javaUpperCase})\\1";
    System.out.println("dD".matches(regex));
    System.out.println("dd".matches(regex));
    System.out.println("DD".matches(regex));
    System.out.println("Dd".matches(regex));

Output:

false
false
false
true
Evgeniy Dorofeev answered Sep 21 '22 23:09