
regex to match a word with unique (non-repeating) characters

Tags:

regex

I'm looking for a regex that will match a word only if all its characters are unique, meaning, every character in the word appears only once.

Example:
abcdefg -> will return MATCH
abcdefgbh -> will return NO MATCH (because the letter b appears more than once)

Nir Alfasi asked Oct 13 '12 06:10



2 Answers

Try this, it might work:

^(?:([A-Za-z])(?!.*\1))*$ 

Explanation

Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
Match the regular expression below «(?:([A-Za-z])(?!.*\1))*»
   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
   Match the regular expression below and capture its match into backreference number 1 «([A-Za-z])»
      Match a single character in the range between “A” and “Z” or between “a” and “z” «[A-Za-z]»
   Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!.*\1)»
      Match any single character that is not a line break character «.*»
         Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
      Match the same text as most recently matched by capturing group number 1 «\1»
Assert position at the end of a line (at the end of the string or before a line break character) «$»
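To try the pattern out, here is a minimal sketch in Python. The choice of language and the helper name has_unique_letters are assumptions for illustration only; the pattern itself is the one given above.

```python
import re

# Pattern from the answer: every letter consumed must not occur again
# anywhere later in the string (negative lookahead with a backreference).
UNIQUE_LETTERS = re.compile(r"^(?:([A-Za-z])(?!.*\1))*$")


def has_unique_letters(word: str) -> bool:
    """Hypothetical helper: True if `word` is made of ASCII letters, none repeated."""
    return UNIQUE_LETTERS.match(word) is not None


print(has_unique_letters("abcdefg"))    # True
print(has_unique_letters("abcdefgbh"))  # False, "b" occurs twice
```

As written, the pattern also matches the empty string (the group may repeat zero times) and is case-sensitive, so "aA" counts as two different letters. Any character outside A-Z and a-z makes the match fail.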
John Woo answered Oct 06 '22 07:10


You can check whether there are 2 instances of the character in the string:

^.*(.).*\1.*$ 

(I simply capture one character and use a backreference to check whether it has a copy elsewhere in the string. The remaining .* parts are don't-cares.)

If the regex above matches, the string contains a repeated character. If it doesn't match, all the characters are unique.

The advantage of the regex above is that it works even when the regex engine doesn't support lookaround.

John Woo's solution is a neat way to check for uniqueness directly. It asserts at every character that the rest of the string does not contain the current character.
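The inverted check from this answer can be sketched the same way in Python. The helper name all_chars_unique is hypothetical; the pattern is the duplicate-detecting one shown above, and the result is negated because a match means a repeat was found.

```python
import re

# Pattern from the answer: matches when some character occurs at least twice.
HAS_DUPLICATE = re.compile(r"^.*(.).*\1.*$")


def all_chars_unique(text: str) -> bool:
    """Hypothetical helper: True when the duplicate-detecting pattern does NOT match."""
    return HAS_DUPLICATE.match(text) is None


print(all_chars_unique("abcdefg"))    # True
print(all_chars_unique("abcdefgbh"))  # False, "b" occurs twice
```

Unlike the lookahead version, this one is not limited to letters: any repeated character (digits, punctuation) is detected. It assumes single-line input, since . does not match a line break by default.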

nhahtdh answered Oct 06 '22 06:10