I am trying to extract gmail.com from a passage, but I only want the matches that are not immediately preceded by @.
Example: [email protected] (don't match this); www.gmail.com (match this)
I tried the following: (?!@)gmail\.com but this did not work. It matches both of the cases in the example above. Any suggestions?
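You want a negative lookbehind, if your regex flavor supports it: (?<!@)gmail\.com. Your attempt, (?!@), is a negative lookahead: it tests the character at the current position, which is the g of gmail and never an @, so it always succeeds. Also add \b word boundaries to avoid matching foogmail.comz, like: (?<!@)\bgmail\.com\b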
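Here is a rough sketch of the difference in Python, whose re module supports fixed-width lookbehind. The sample text, the address user@gmail.com, and the helper match_starts are made up for illustration; other regex flavors may behave differently.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "mail user@gmail.com or visit www.gmail.com, not foogmail.comz"

# The pattern from the question: a negative lookahead, which looks at the
# character about to be matched (always "g"), so it never rejects anything.
lookahead = re.compile(r"(?!@)gmail\.com")

# Negative lookbehind plus word boundaries, as suggested above.
lookbehind = re.compile(r"(?<!@)\bgmail\.com\b")

def match_starts(pattern, s):
    """Return the start offset of every match of pattern in s."""
    return [m.start() for m in pattern.finditer(s)]

lookahead_hits = match_starts(lookahead, text)
lookbehind_hits = match_starts(lookbehind, text)

print("lookahead matches at:", lookahead_hits)
print("lookbehind matches at:", lookbehind_hits)
```

The lookahead version should report a hit inside the e-mail address, the www host, and foogmail.comz, while the lookbehind version should report only the one after "www.".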
[^@\s]*(?<!@)\bgmail\.com\b
assuming you want to find strings in a longer text body, not validate entire strings.
Explanation:
[^@\s]* # match any number of non-@, non-space characters
(?<!@) # assert that the previous character isn't an @
\b # match a word boundary (so we don't match hogmail.com)
gmail\.com # match gmail.com
\b # match a word boundary
At first glance, the (?<!@) lookbehind assertion appears unnecessary, but it isn't. Without it, [^@\s]* could match zero characters right after the @, and the gmail.com part of [email protected] would still match.
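To see why the lookbehind matters, here is a small Python sketch that runs the pattern with and without it. The sample text and the address user@gmail.com are assumptions for illustration only.

```python
import re

# Hypothetical sample text, invented for this illustration.
text = "write to user@gmail.com or browse www.gmail.com"

# The pattern from this answer.
with_lookbehind = re.compile(r"[^@\s]*(?<!@)\bgmail\.com\b")

# Same pattern with the lookbehind removed, for comparison.
without_lookbehind = re.compile(r"[^@\s]*\bgmail\.com\b")

# Neither pattern has capture groups, so findall returns the full matches.
with_hits = with_lookbehind.findall(text)
without_hits = without_lookbehind.findall(text)

print("with lookbehind:   ", with_hits)
print("without lookbehind:", without_hits)
```

Note that because of the leading [^@\s]*, the match includes whatever precedes gmail.com in the same token, so the host is returned as www.gmail.com rather than just gmail.com.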