I am trying to match a string that does not contain a certain substring.
My string always starts with "http://www.domain.com/".
The substring I want to exclude from matches is ".a/", which comes directly after that prefix (it is a folder name in the URL path).
There will be more characters in the string after the substring I want to exclude.
For example:
"http://www.domain.com/.a/test.jpg" should not be matched
But "http://www.domain.com/test.jpg" should be
Use a negative lookahead assertion:
^http://www\.domain\.com/(?!\.a/).*$
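The part (?!\.a/) fails the match if "http://www.domain.com/" is immediately followed by ".a/". It only looks at that one position, so a ".a/" that appears deeper in the path does not prevent a match.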
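To see the lookahead in action, here is a minimal sketch in Python (my choice of language, the question does not name one). The first two URLs are the ones from the question; the third is a made-up extra case that shows the lookahead only checks the position right after the prefix.

```python
import re

# Same pattern as in the answer: the lookahead rejects ".a/" only when it
# appears directly after "http://www.domain.com/".
PATTERN = re.compile(r"^http://www\.domain\.com/(?!\.a/).*$")

urls = [
    "http://www.domain.com/.a/test.jpg",      # from the question: should not match
    "http://www.domain.com/test.jpg",         # from the question: should match
    "http://www.domain.com/dir/.a/test.jpg",  # hypothetical: ".a/" is not right after the prefix
]

matched = [u for u in urls if PATTERN.match(u)]
for u in matched:
    print(u)
```

If ".a/" has to be excluded anywhere in the path, the lookahead would need to be written as (?!.*\.a/) instead.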
My advice in such cases is not to construct overly complicated regexes with negative lookahead assertions or the like.
Keep it simple and stupid!
Do two matches: one for the positives, then sort out the negatives afterwards (or the other way around). Most of the time, the regexes become easier, if not trivial.
And your program gets clearer.
For example, to extract all lines with foo, but not foobar, I use:
grep foo | grep -v foobar
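Applied to the URLs from the question, the same two-step idea could look roughly like this in Python. This is only a sketch: the names PREFIX, EXCLUDED, and is_wanted are made up for illustration.

```python
import re

# Step 1: the simple positive match (the URL starts with the domain).
PREFIX = re.compile(r"^http://www\.domain\.com/")
# Step 2: the simple negative match (the ".a/" folder right after the domain).
EXCLUDED = re.compile(r"^http://www\.domain\.com/\.a/")


def is_wanted(url: str) -> bool:
    """Keep URLs that pass the positive match and fail the negative one."""
    return PREFIX.match(url) is not None and EXCLUDED.match(url) is None


urls = [
    "http://www.domain.com/.a/test.jpg",
    "http://www.domain.com/test.jpg",
]
kept = [u for u in urls if is_wanted(u)]
print(kept)
```

Neither regex needs a lookahead, at the cost of running two matches per URL.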