I want to match a URL that contains any sequence of valid URL characters but not a particular word. The URL in question is http://gateway.ovid.com, and I want to match anything in that position except the word 'gateway': other hosts under ovid.com should match, but http://gateway.ovid.com should not.
Something like the following:
^http://([a-z0-9\-\.]+|(?<!gateway))\.ovid\.com$
but it doesn't seem to work.
Update: Sorry, I forgot to mention the language. It's C# (.NET).
How do you ignore something in regex? To match any character except a list of excluded characters, put the excluded characters between [^ and ]. The caret ^ must immediately follow the [, or else it stands for just itself.
?= is a positive lookahead, a type of zero-width assertion. It says that the current position must be followed by whatever is within the parentheses, but that part is not consumed and does not become part of the match. For instance, a lookahead containing .*\d requires that zero or more characters and then a digit follow (but again, that part is not consumed).
The \b metacharacter matches at the beginning or end of a word.
[] denotes a character class. () denotes a capturing group. [a-z0-9] -- One character that is in the range a-z OR 0-9. (a-z0-9) -- Explicit capture of the literal text a-z0-9.
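The difference between the two bracket types can be checked with a small console program. This is only an illustrative sketch: the class name Program and the test strings are made up for the example, while Regex.IsMatch and Regex.Match are the standard System.Text.RegularExpressions calls.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Character class: exactly one character from a-z or 0-9.
        Console.WriteLine(Regex.IsMatch("q", @"^[a-z0-9]$"));      // True
        Console.WriteLine(Regex.IsMatch("a-z0-9", @"^[a-z0-9]$")); // False: six characters, and '-' is not in the class

        // Capturing group: the literal text "a-z0-9", captured as group 1.
        Match m = Regex.Match("a-z0-9", @"^(a-z0-9)$");
        Console.WriteLine(m.Success);                              // True
        Console.WriteLine(m.Groups[1].Value);                      // a-z0-9
        Console.WriteLine(Regex.IsMatch("q", @"^(a-z0-9)$"));      // False
    }
}
```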
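Your regex is almost correct, except for the extra '|' after the '+'. With the '|', the lookbehind becomes a separate alternative, so it never constrains what [a-z0-9\-\.]+ matches. Remove the '|':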
^http://([a-z0-9\-\.]+(?<!gateway))\.ovid\.com$
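A quick way to see how the corrected pattern behaves is to run it against a few URLs. The sketch below is a minimal console program. Apart from http://gateway.ovid.com, the URLs are hypothetical test inputs chosen for illustration, not taken from the question.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Pattern from the answer: the negative lookbehind is checked right
        // after the host part, just before ".ovid.com".
        var pattern = new Regex(@"^http://([a-z0-9\-\.]+(?<!gateway))\.ovid\.com$");

        // Hypothetical inputs, except the first, which is the URL from the question.
        string[] urls =
        {
            "http://gateway.ovid.com",    // False: host part is exactly "gateway"
            "http://www.ovid.com",        // True
            "http://mygateway.ovid.com",  // False: host part ends in "gateway"
            "http://gateway.eu.ovid.com", // True: "gateway" is not directly before ".ovid.com"
        };

        foreach (var url in urls)
        {
            Console.WriteLine($"{url} -> {pattern.IsMatch(url)}");
        }
    }
}
```

Note that the lookbehind only inspects the seven characters immediately before .ovid.com, so it rejects any host part that ends in "gateway", not just the exact word. If only the exact host should be excluded, a negative lookahead such as (?!gateway\.ovid\.com$) placed right after ^http:// is one possible alternative. That is a different technique from the one in the answer.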