I'm not good at regex and am trying to write two regexes.
Regex1:
All specified words in any order but nothing else. (repetition allowed).
Regex2:
All specified words in any order but nothing else. (repetition not allowed).
Words:
aaa, bbb, ccc
Strings:
aaa ccc bbb
aaa ccc
aaa bbb ddd ccc
bbb aaa bbb ccc
Regex1 should evaluate the above strings as:
true -> all words present in any order
false -> bbb is missing
false -> unknown word 'ddd'
true -> all words present in any order and repetition is allowed
Regex2 should evaluate the above strings as:
true -> all words present in any order
false -> bbb is missing
false -> unknown word 'ddd'
false -> repetition not allowed
My Attempt
/^(?=.*\baaa\b)(?=.*\bbbb\b)(?=.*\bccc\b).*$/
I'm asking for learning purposes, so please explain your answer.
Without repetition (regex101):
^(?:(aaa|bbb|ccc)(?!.*?\b\1) ?\b){3}$
And with repetition (regex101):
^(?=.*?\baaa)(?=.*?\bbbb)(?=.*?\bccc)(?:(aaa|bbb|ccc) ?\b)+$
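To see how these two patterns behave, here is a small sketch that runs them against the four strings from the question. The names noRepeat, withRepeat, samples and results are made up for illustration; the patterns themselves are copied unchanged from above.

```javascript
// Without repetition: each of the 3 iterations matches one word, and the
// negative lookahead (?!.*?\b\1) rejects it if the same word occurs again later.
const noRepeat = /^(?:(aaa|bbb|ccc)(?!.*?\b\1) ?\b){3}$/;

// With repetition: three lookaheads require each word at least once,
// then the body accepts only words from the list, one or more times.
const withRepeat = /^(?=.*?\baaa)(?=.*?\bbbb)(?=.*?\bccc)(?:(aaa|bbb|ccc) ?\b)+$/;

// The four sample strings from the question.
const samples = ['aaa ccc bbb', 'aaa ccc', 'aaa bbb ddd ccc', 'bbb aaa bbb ccc'];

// Collect one result row per sample string.
const results = samples.map((s) => ({
  input: s,
  noRepeat: noRepeat.test(s),
  withRepeat: withRepeat.test(s),
}));

for (const row of results) {
  console.log(row.input, '| no repetition:', row.noRepeat, '| repetition:', row.withRepeat);
}
```

Only the last string should differ between the two patterns, because it contains every word but repeats bbb.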
Two more ideas. Regex explanation at regex101 on the right side.
For Regex 1:
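// Lookaheads require each word at least once; the body allows only the listed words.
var re = /^(?=.*?\baaa\b)(?=.*?\bbbb\b)(?=.*?\bccc\b)\b(?:aaa|bbb|ccc)\b(?: +\b(?:aaa|bbb|ccc)\b)*$/;
var res = document.getElementById('result');
res.innerText += re.test('aaa ccc bbb');        // true
res.innerText += ', ' + re.test('aaa ccc ddd'); // false: 'ddd' is not allowed, 'bbb' is missing
res.innerText += ', ' + re.test('aaa ddd bbb'); // false: 'ddd' is not allowed, 'ccc' is missing
res.innerText += ', ' + re.test('ccc bbb ccc'); // false: 'aaa' is missing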
<div id="result"></div>
Your code already does part of the trick. Your positive lookaheads check that all words appear somewhere, but not that they are the only words present. To enforce that, the circumflex (^) at the beginning anchors the match at the start of the string, and the trailing .* is replaced by an explicit list of allowed words. First comes the non-capturing group \b(?:aaa|bbb|ccc)\b, which matches the first occurrence of any word.
This is then followed by any number of further words, each preceded by at least one space: (?: +\b(?:aaa|bbb|ccc)\b)*, which is basically the same pattern, but with " +" (one or more spaces) in front, and wrapped in a *. If tabs or other whitespace should also count as separators, use \s+ instead. Finally, the string has to end there, which is what the dollar sign ($) enforces.
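The same structure can be generated for any word list. The following is only a sketch: buildAllWordsRegex and escapeRegExp are hypothetical helper names, not part of any library, and it assumes every word starts and ends with a word character so that \b behaves as described above.

```javascript
// Escape regex metacharacters so that a word is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds the "all words, any order, repetition allowed"
// pattern for an arbitrary word list.
function buildAllWordsRegex(words) {
  const escaped = words.map(escapeRegExp);
  // One positive lookahead per word: the word must occur somewhere.
  const lookaheads = escaped.map((w) => `(?=.*?\\b${w}\\b)`).join('');
  // Any single allowed word, delimited by word boundaries.
  const anyWord = `\\b(?:${escaped.join('|')})\\b`;
  // First word, then zero or more " word" repetitions, anchored at both ends.
  return new RegExp(`^${lookaheads}${anyWord}(?: +${anyWord})*$`);
}

const allWords = buildAllWordsRegex(['aaa', 'bbb', 'ccc']);
console.log(allWords.test('aaa ccc bbb'));     // true
console.log(allWords.test('bbb aaa bbb ccc')); // true: repetition is allowed
console.log(allWords.test('aaa bbb ddd ccc')); // false: 'ddd' is not in the list
console.log(allWords.test('aaa ccc'));         // false: 'bbb' is missing
```

Building the pattern from an array keeps the lookaheads and the alternation in sync when the word list changes.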
For Regex 2:
The basic strategy is the same. You additionally check, with a negative lookahead, that no matched word occurs a second time:
//var re = /^(?=.*?\baaa\b)(?!.*?\baaa\b.*?\baaa\b)(?=.*?\bbbb\b)(?!.*?\bbbb\b.*?\bbbb\b)(?=.*?\bccc\b)(?!.*?\bccc\b.*?\bccc\b)\b(?:aaa|bbb|ccc)\b(?:\s+\b(?:aaa|bbb|ccc)\b)*$/;
// optimized version, see comments
var re = /^(?=.*?\baaa\b)(?=.*?\bbbb\b)(?=.*?\bccc\b)(?!.*?\b(\w+)\b.*?\b\1\b)\b(?:aaa|bbb|ccc)\b(?: +\b(?:aaa|bbb|ccc)\b)*$/;
var res = document.getElementById('result');
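res.innerText += re.test('aaa ccc bbb');            // true
res.innerText += ', ' + re.test('aaa ccc ddd');     // false: 'ddd' is not allowed, 'bbb' is missing
res.innerText += ', ' + re.test('aaa bbb aaa');     // false: 'aaa' is repeated, 'ccc' is missing
res.innerText += ', ' + re.test('aaa ccc bbb ccc'); // false: 'ccc' is repeated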
<div id="result"></div>
In the first (commented-out) version, we have the positive lookahead (?=.*?\bword\b) to check that the word exists. We follow that with the negative lookahead (?!.*?\bword\b.*?\bword\b) to check that the word does not occur more than once. Repeat for all words. Presto!
Update: Instead of checking that each specific word isn't repeated, we can check that no word at all is repeated by using the (?!.*?\b(\w+)\b.*?\b\1\b) construct. Here (\w+) captures any word, and the backreference \1 looks for a second occurrence of it. This makes the regex more concise. Thanks to @revo for pointing it out.
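As a rough check that the two variants agree, the sketch below runs the per-word version and the backreference version over a few inputs. The names perWord, generic, inputs and agree, as well as the extra sample 'ccc bbb aaa', are illustrative additions; the two patterns are the ones shown in the code above.

```javascript
// Per-word variant: each word gets a positive lookahead (must occur)
// and a negative lookahead (must not occur twice).
const perWord = /^(?=.*?\baaa\b)(?!.*?\baaa\b.*?\baaa\b)(?=.*?\bbbb\b)(?!.*?\bbbb\b.*?\bbbb\b)(?=.*?\bccc\b)(?!.*?\bccc\b.*?\bccc\b)\b(?:aaa|bbb|ccc)\b(?:\s+\b(?:aaa|bbb|ccc)\b)*$/;

// Generic variant: (\w+) captures any word, and \1 rejects a second occurrence of it.
const generic = /^(?=.*?\baaa\b)(?=.*?\bbbb\b)(?=.*?\bccc\b)(?!.*?\b(\w+)\b.*?\b\1\b)\b(?:aaa|bbb|ccc)\b(?: +\b(?:aaa|bbb|ccc)\b)*$/;

const inputs = ['aaa ccc bbb', 'ccc bbb aaa', 'aaa ccc', 'aaa bbb ddd ccc', 'bbb aaa bbb ccc'];

// Record whether both variants give the same answer for every input.
let agree = true;
for (const s of inputs) {
  const a = perWord.test(s);
  const b = generic.test(s);
  if (a !== b) agree = false;
  console.log(s, '->', a, b);
}
console.log('variants agree:', agree);
```

Note that the per-word pattern separates words with \s+ while the generic one uses a literal space, so the two would differ on tab-separated input.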