Matching at least one lower case letter AND at least one upper case letter

I am trying to extract words [a-zA-Z]+ with one constraint: a word must contain at least one lower case letter AND at least one upper case letter (in any position within the word). Example: if input is hello 123 worLD, the only match should be worLD.

I tried to use positive lookaheads like this:

echo "hello 123 worLD" | grep -oP "(?=.*[a-z])(?=.*[A-Z])[a-zA-Z]+"
hello

This is not correct: the only match is hello instead of worLD. Then I tried this:

echo "hello 123 worLD" | grep -oP "\K((?=.*[a-z])(?=.*[A-Z])[a-zA-Z]+)"
hello
worLD

This is still incorrect: hello should not be matched.

usual me asked Aug 11 '16 13:08


2 Answers

The .* in the lookaheads lets each check succeed on a letter anywhere later in the string, not only in the word at the current position. Restrict the lookaheads to letters with [a-zA-Z]*:

echo "hello 123 worLD" | grep -oP "\\b(?=[A-Za-z]*[a-z])(?=[A-Za-z]*[A-Z])[a-zA-Z]+"

See the demo online

I also added a word boundary \b at the start so that the lookahead checks run only at a word boundary, not in the middle of a word.
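To see how the restricted lookaheads behave on more than one word, here is a minimal sketch that wraps the pattern above in a shell function. The function name mixed_case_words and the second sample line are made up for illustration, and it assumes a grep built with PCRE support (the -P option, as in GNU grep).

```shell
# Hypothetical helper (name is an assumption): print every word on stdin
# that contains at least one lowercase and one uppercase letter.
mixed_case_words() {
  # Each lookahead scans letters only, so it cannot look past the current word.
  grep -oP '\b(?=[A-Za-z]*[a-z])(?=[A-Za-z]*[A-Z])[A-Za-z]+'
}

echo "hello 123 worLD" | mixed_case_words
# prints: worLD

echo "Foo BAR baz qUx" | mixed_case_words
# prints: Foo and qUx, one per line
```

The pattern is single-quoted here, so \b is written with one backslash; inside double quotes the shell turns \\b into \b before grep sees it.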

Wiktor Stribiżew answered Oct 18 '22 13:10


Answer:

echo "hello 123 worLD" | grep -oP "\b(?=[A-Z]+[a-z]|[a-z]+[A-Z])[a-zA-Z]*"

Demo: https://ideone.com/HjLH5o

Explanation:

The lookahead first checks that the word starts with one or more uppercase letters followed by a lowercase letter, or vice versa. The rest of the pattern then matches any number of lowercase and uppercase letters in any order.
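As a rough check that this alternation accepts the same words as the two-lookahead pattern from the other answer, the sketch below runs both over one sample line and compares the results. The sample string and the variable names are assumptions made for illustration, and grep needs PCRE support (-P).

```shell
# Both patterns are copied from the answers; the sample line is hypothetical.
lookaheads='\b(?=[A-Za-z]*[a-z])(?=[A-Za-z]*[A-Z])[a-zA-Z]+'
alternation='\b(?=[A-Z]+[a-z]|[a-z]+[A-Z])[a-zA-Z]*'
sample='hello 123 worLD Foo BAR aB bA x'

# Collect the matches of each pattern, one word per line.
first=$(printf '%s\n' "$sample" | grep -oP "$lookaheads")
second=$(printf '%s\n' "$sample" | grep -oP "$alternation")

if [ "$first" = "$second" ]; then
  echo "patterns agree:"
  printf '%s\n' "$first"
else
  echo "patterns differ"
fi
```

On this sample both patterns should report worLD, Foo, aB and bA. A word made only of letters has both cases exactly when its leading run of one case is followed by a letter of the other case, which is what the alternation tests.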

Performance:

This solution takes 31 steps to reach the match on the provided test string, while the accepted solution takes 47 steps.

fabianegli answered Oct 18 '22 11:10