I need a regex string to work with a custom Exchange DLP "Sensitive Information" type.
I.e. it should match on Smith, but not on John Smith or Smith John.
(?i)(?<!John\s)Smith appears to work for "John Smith" though I'm not convinced it is 100% efficient.
(?i)(Smith.*\s(?!John)) appears to work for "Smith John" but not if followed by a space or new line.
Have tried the following to combine them into one string but it doesn't seem to work at all.
(?i)(?<!John\s)Smith |(?i)(Smith.*\s(?!John))
(?i)(?<!John\s)Smith.*\s(?!John)
What schoolboy error am I making?
The (?i)(?<!John\s)Smith |(?i)(Smith.*\s(?!John)) pattern matches a Smith (plus a trailing space) that is not immediately preceded by John and one whitespace character, OR a Smith followed by any number of characters and then a whitespace that is not immediately followed by John. Thus, it matches Smith in a lot of positions.
The (?i)(?<!John\s)Smith.*\s(?!John) pattern grabs a Smith that is not immediately preceded by John + whitespace, and then all text up to the last whitespace that is not immediately followed by John.
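To make the over-matching concrete, here is a minimal sketch using Python's re module. Python is only an assumption for illustration; the engine behind Exchange DLP may behave differently. The names ALTERNATION, CONCATENATED and matched_text are made up for this example.

```python
import re

# The case-insensitive flag is passed as an argument: Python's re only accepts
# a global inline (?i) at the very start of a pattern, so the second (?i)
# from the original alternation is dropped here.
ALTERNATION = re.compile(r"(?<!John\s)Smith |(Smith.*\s(?!John))", re.IGNORECASE)
CONCATENATED = re.compile(r"(?<!John\s)Smith.*\s(?!John)", re.IGNORECASE)


def matched_text(pattern, text):
    """Return the matched substring, or None when the pattern finds nothing."""
    match = pattern.search(text)
    return match.group(0) if match else None


# The first alternative has no lookahead, so "Smith John" still matches.
print(repr(matched_text(ALTERNATION, "Smith John")))          # 'Smith '
# The second alternative has no lookbehind, so "John Smith ..." still matches.
print(repr(matched_text(ALTERNATION, "John Smith is here")))  # 'Smith is '
# .* runs on to a later whitespace that is not followed by John.
print(repr(matched_text(CONCATENATED, "Smith John went home")))  # 'Smith John went '
```

In each case the check against John is applied at a position other than the one right next to Smith, which is why the unwanted names still get through.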
Make sure the \s pattern is inside the lookahead:
(?i)(?<!John\s)Smith(?!\s+John)
See the regex demo
Details
(?i) - case-insensitive inline modifier
(?<!John\s) - a location that is not immediately preceded by John and a whitespace char
Smith - a literal substring
(?!\s+John) - the Smith substring must not be immediately followed by 1+ whitespaces (or, if you use \s*, by 0+ whitespaces) and the substring John.
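A quick way to sanity-check the combined pattern is a small test harness. The sketch below uses Python's re module as a stand-in (an assumption; it is not the Exchange DLP engine), and the helper name mentions_smith_alone is hypothetical.

```python
import re

# Lookbehind and lookahead both sit directly next to the literal Smith.
PATTERN = re.compile(r"(?i)(?<!John\s)Smith(?!\s+John)")


def mentions_smith_alone(text):
    """True if text contains Smith that is neither 'John Smith' nor 'Smith John'."""
    return PATTERN.search(text) is not None


for sample in ["Smith", "Mr Smith called", "John Smith", "Smith John",
               "john smith", "Smith   John"]:
    print(f"{sample!r}: {mentions_smith_alone(sample)}")
```

Note that the lookbehind only covers a single whitespace character, so in this sketch "John  Smith" with two spaces would still match; Python's re requires lookbehinds to be fixed-width, so \s+ cannot be used there.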