
Regex to include certain words but exclude another

Tags: regex

I am trying to write a regular expression that checks whether a URL contains certain words and does not contain another.

The reason is that I am tracking traffic moving through my website, and I don't want to count anyone who hits the Thank You page.

So for example:

  • http://www.mywebsite.com/register-now/ - MATCH
  • http://www.mywebsite.com/contact-us/ - MATCH
  • http://www.mywebsite.com/register-now/thank-you - NO MATCH
  • http://www.mywebsite.com/contact-us/thank-you - NO MATCH
  • http://www.mywebsite.com/thank-you - NO MATCH

I have two words (register-now and contact-us), at least one of which must be in the URL. However, I must also ensure that a third word (thank-you) is not in the URL.

I have tried to use a negative lookahead to check that the URL does NOT contain thank-you, but it is not working:

"^(?!.*\/thank\-you+)\/(contact\-us|register\-now)\/.*"
Javacadabra asked Oct 18 '16 10:10




1 Answer

In a single regex, you can use a negative lookahead:

^(?!.*\/thank-you(?:\/|$))(?:.*\/)?(?:contact-us|register-now)\/


  • (?!.*\/thank-you(?:\/|$)) is a negative lookahead that fails the match if the URL ends with /thank-you or contains /thank-you/ anywhere.
  • Use MULTILINE mode if your text contains multiple URLs, one per line, so that ^ and $ anchor at the start and end of each line.
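
As a rough check, the pattern can be run against the sample URLs from the question. This is only a sketch in Python; the question does not say which regex engine is in use, so the flavor is an assumption, and `should_count`, `SAMPLES`, and `counted` are hypothetical names chosen for illustration.

```python
import re

# Pattern copied from the answer; Python's re accepts the escaped slashes as-is.
PATTERN = re.compile(
    r"^(?!.*\/thank-you(?:\/|$))(?:.*\/)?(?:contact-us|register-now)\/"
)

# Sample URLs from the question, mapped to the outcome the asker expects.
SAMPLES = {
    "http://www.mywebsite.com/register-now/": True,
    "http://www.mywebsite.com/contact-us/": True,
    "http://www.mywebsite.com/register-now/thank-you": False,
    "http://www.mywebsite.com/contact-us/thank-you": False,
    "http://www.mywebsite.com/thank-you": False,
}


def should_count(url: str) -> bool:
    """Return True when the URL should be counted as tracked traffic."""
    return PATTERN.search(url) is not None


results = {url: should_count(url) for url in SAMPLES}

# MULTILINE variant: ^ and $ anchor at every line, so a block of text with
# one URL per line can be scanned in a single pass. The trailing .* only
# extends each match to the end of its line.
MULTILINE_PATTERN = re.compile(PATTERN.pattern + r".*", re.MULTILINE)
counted = [m.group(0) for m in MULTILINE_PATTERN.finditer("\n".join(SAMPLES))]
```

Note that the pattern as written requires a trailing slash after contact-us or register-now, so a URL such as http://www.mywebsite.com/register-now (no trailing slash) would not be counted.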
anubhava answered Nov 03 '22 10:11
