Say I have a DN string, something like this:
OU=Karen,OU=Office,OU=admin,DC=corp,DC=Fabrikam,DC=COM
How can I write a regular expression that picks only DNs that contain both OU=Karen and OU=admin?
For reference, this is the regex lookahead solution, which matches the whole string if it contains the required parts in any order. Unless you need to store the pattern in some sort of configurable variable, though, I'd stick with nhahtdh's solution.
/^(?=.*OU=Karen)(?=.*OU=admin).*$/
^ - line start
(?= - start zero-width positive lookahead
.* - anything or nothing
OU=Karen - literal
) - end zero-width positive lookahead
- place as many positive or negative look-aheads as required
.* - the whole line
$ - line end
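As a rough sketch of how the lookahead pattern behaves, here it is in Python (my choice of language, the question doesn't name one). The function name `has_both` is made up for illustration. Note that the pattern only checks for the literal text anywhere in the string, so a longer name that merely starts with the same characters also passes.

```python
import re

# The pattern from above, without the /.../ delimiters.
BOTH_OUS = re.compile(r"^(?=.*OU=Karen)(?=.*OU=admin).*$")

def has_both(dn: str) -> bool:
    """Return True if both literals occur somewhere in dn, in any order."""
    return BOTH_OUS.match(dn) is not None

print(has_both("OU=Karen,OU=Office,OU=admin,DC=corp,DC=Fabrikam,DC=COM"))  # True
print(has_both("OU=admin,OU=Karen,DC=corp"))                               # True
print(has_both("OU=Karen,OU=Office,DC=corp"))                              # False
# Plain substring matching: these are not the components asked for, but they match.
print(has_both("OU=Karen7,OU=administrator-lesser"))                       # True
```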
You realise you don't have to do everything with a single regex, or even with a regex at all.
Regular expressions are very good for catching classes of input but, if you have two totally fixed strings, you can just use a contains()-type method for both of them and then AND the results.
Alternatively, if you need to use regexes, you can do that twice (once per string) and AND the results together.
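Both variants might look something like this in Python. The function names are invented for the example, and the `(?:^|,)` / `(?:,|$)` boundaries in the second one are my own assumption about what "one regex per string" should check, so that only whole components count.

```python
import re

def has_both_contains(dn: str) -> bool:
    """Two plain substring checks, ANDed together."""
    return "OU=Karen" in dn and "OU=admin" in dn

def has_both_two_regexes(dn: str) -> bool:
    """One small regex per component, ANDed together.

    Each regex requires the component to be delimited by a comma
    or by the start/end of the string.
    """
    karen = re.search(r"(?:^|,)OU=Karen(?:,|$)", dn)
    admin = re.search(r"(?:^|,)OU=admin(?:,|$)", dn)
    return karen is not None and admin is not None

dn = "OU=Karen,OU=Office,OU=admin,DC=corp,DC=Fabrikam,DC=COM"
print(has_both_contains(dn), has_both_two_regexes(dn))  # True True
```

The substring version is the simplest, but it also accepts something like OU=Karen7, while the two anchored regexes do not.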
If you need to do it with a single regex, you could try something like:
,OU=Karen,.*,OU=admin,|,OU=admin,.*,OU=Karen,
but you'll then also have to worry about when those stanzas appear at the start or end of the line, and all sorts of other edge cases (one or both at start or end, both next to each other, names like Karen7 or administrator-lesser, and so on).
Having to allow for all possibilities will probably end up with something monstrous like:
^OU=Karen(,[^,]*)*,OU=admin,|
^OU=Karen(,[^,]*)*,OU=admin$|
,OU=Karen(,[^,]*)*,OU=admin,|
,OU=Karen(,[^,]*)*,OU=admin$|
^OU=admin(,[^,]*)*,OU=Karen,|
^OU=admin(,[^,]*)*,OU=Karen$|
,OU=admin(,[^,]*)*,OU=Karen,|
,OU=admin(,[^,]*)*,OU=Karen$
although, with an advanced enough regex engine, this may be reducible to something smaller (though it would be unlikely to be any faster, simply because of all the look-ahead and backtracking).
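As one possible illustration of "something smaller", here is a sketch that assumes an engine with lookaheads (Python's `re` in this case). The pattern is my own reduction, not something from the answer above: each lookahead demands that the component is preceded by a comma or the start of the string and followed by a comma or the end.

```python
import re

# Hypothetical reduced form of the eight-way alternation above.
# (?:.*,)?   -> optional "anything up to a comma" before the component
# (?:,|$)    -> the component must end at a comma or at the end of the string
REDUCED = re.compile(
    r"^(?=(?:.*,)?OU=Karen(?:,|$))"
    r"(?=(?:.*,)?OU=admin(?:,|$))"
)

def has_both(dn: str) -> bool:
    """Return True if both whole components are present, in any order."""
    return REDUCED.match(dn) is not None

print(has_both("OU=Karen,OU=Office,OU=admin,DC=corp,DC=Fabrikam,DC=COM"))  # True
print(has_both("OU=admin,OU=Karen"))                                       # True
print(has_both("OU=Karen7,OU=admin"))                                      # False
print(has_both("OU=Karen,OU=administrator-lesser"))                        # False
```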
One way that could be improved without a complex regex is to massage your string slightly before-hand so that boundary checks aren't needed:
newString = "," + origString.replace (",", ",,") + ","
so that it starts and ends with a comma and all commas within it are duplicated:
,OU=Karen,,OU=Office,,OU=admin,,DC=corp,,DC=Fabrikam,,DC=COM,
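A minimal Python sketch of that massaging step, together with a check against the simpler pattern quoted just below. The names `massage` and `has_both` are invented for the example.

```python
import re

def massage(dn: str) -> str:
    """Wrap the string in commas and double every comma inside it."""
    return "," + dn.replace(",", ",,") + ","

# After massaging, every component is surrounded by its own pair of commas,
# so no start-of-line or end-of-line cases are needed.
SIMPLE = re.compile(r",OU=Karen,.*,OU=admin,|,OU=admin,.*,OU=Karen,")

def has_both(dn: str) -> bool:
    return SIMPLE.search(massage(dn)) is not None

print(massage("OU=Karen,OU=Office,OU=admin,DC=corp,DC=Fabrikam,DC=COM"))
# ,OU=Karen,,OU=Office,,OU=admin,,DC=corp,,DC=Fabrikam,,DC=COM,
print(has_both("OU=Karen,OU=admin"))   # True  (adjacent components)
print(has_both("OU=Karen7,OU=admin"))  # False
```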
Then you need only check for the much simpler:
,OU=Karen,.*,OU=admin,|,OU=admin,.*,OU=Karen,
and this removes all the potential problems mentioned: stanzas at the start or end of the line, stanzas next to each other, and names like Karen2 being matched accidentally.
Probably the best way to do this (if your language allows) is to simply split the string on commas and examine them, something like:
str = "OU=Karen,OU=Office,OU=admin,DC=corp,DC=Fabrikam,DC=COM"
elems[] = str.splitOn(",")
gotKaren = false
gotAdmin = false
for each elem in elems:
    if elem = "OU=Karen": gotKaren = true
    if elem = "OU=admin": gotAdmin = true
if gotKaren and gotAdmin:
    weaveYourMagicHere()
This both ignores the order in which they may appear and bypasses any regex "gymnastics" that may be required to detect the edge cases.
It also has the advantage of probably being more readable than the equivalent regex :-)
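For what it's worth, a runnable take on the pseudocode above could look like this in Python. The function name and the `required` parameter are made up for the example, and it assumes none of the components contain an escaped comma, since a plain split on "," would cut those in the wrong place.

```python
def has_all_components(dn: str, required=("OU=Karen", "OU=admin")) -> bool:
    """Split the DN on commas and check that every required component is present."""
    elems = set(dn.split(","))
    return all(component in elems for component in required)

dn = "OU=Karen,OU=Office,OU=admin,DC=corp,DC=Fabrikam,DC=COM"
if has_all_components(dn):
    print("both present")  # this is where weaveYourMagicHere() would go

print(has_all_components("OU=admin,OU=Karen"))                 # True
print(has_all_components("OU=Karen7,OU=administrator-lesser")) # False
```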