Regex for string containing one string, but not another [duplicate]

We have a regex in our project that matches any URL that contains the string "/pdf/":

(.+)/pdf/.+

I need to modify it so that it won't match URLs that also contain "help".

Example:

Shouldn't match: "/dealer/help/us/en/pdf/simple.pdf"
Should match: "/dealer/us/en/pdf/simple.pdf"

Jacob Petersen asked Sep 06 '16 16:09



2 Answers

If lookarounds are supported, this is very easy to achieve:

(?=.*/pdf/)(?!.*help)(.+)

See a demo on regex101.com.
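As a rough sketch of how the lookahead pattern behaves, here it is in Python's `re` module. The function name `is_wanted` is made up for illustration, and the leading `^` is an assumption added on top of the pattern above: with an unanchored search, the engine could start a match partway through "help" (at "elp/..."), where the negative lookahead no longer sees the word.

```python
import re

# Lookahead pattern from the answer, with ^ added (an assumption) so that
# both lookaheads are always evaluated from the start of the string.
PDF_WITHOUT_HELP = re.compile(r"^(?=.*/pdf/)(?!.*help)(.+)")


def is_wanted(url: str) -> bool:
    """Return True if the URL contains /pdf/ and does not contain help."""
    return PDF_WITHOUT_HELP.search(url) is not None


print(is_wanted("/dealer/help/us/en/pdf/simple.pdf"))  # False
print(is_wanted("/dealer/us/en/pdf/simple.pdf"))       # True
```

If the regex is applied through an API that always matches the whole string, the `^` should be redundant.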

Jan answered Sep 26 '22 00:09



(?:^|\s)((?:[^h ]|h(?!elp))+\/pdf\/\S*)(?:$|\s)

The first thing is to match either a whitespace character or the start of a line:

(?:^|\s)

Then we match anything that is not a space or h, OR any h that is not followed by elp, one or more times (+), until we find /pdf/, then match non-space characters (\S) any number of times (*):

((?:[^h ]|h(?!elp))+\/pdf\/\S*)

If we also want to reject help after the /pdf/, we can reuse the same matching as at the start:

((?:[^h ]|h(?!elp))+\/pdf\/(?:[^h ]|h(?!elp))+)
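To make the difference between the two variants concrete, the following Python sketch runs both on a URL where help appears only after /pdf/. The names `TEMPERED`, `BEFORE_ONLY` and `BOTH_SIDES` and the sample URL are illustrative assumptions, not part of the original answer.

```python
import re

# "Any character except h or space, or an h not followed by elp", repeated.
TEMPERED = r"(?:[^h ]|h(?!elp))+"

# Variant 1: help is rejected only before /pdf/.
BEFORE_ONLY = re.compile(r"(?:^|\s)(" + TEMPERED + r"/pdf/\S*)(?:$|\s)")

# Variant 2: help is rejected on both sides of /pdf/.
BOTH_SIDES = re.compile(
    r"(?:^|\s)(" + TEMPERED + r"/pdf/" + TEMPERED + r")(?:$|\s)"
)

url = "/dealer/us/en/pdf/help.pdf"  # help appears after /pdf/
print(BEFORE_ONLY.search(url) is not None)  # True
print(BOTH_SIDES.search(url) is not None)   # False
```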

Finally, we match a whitespace character or the end of the line/string ($):

(?:$|\s)

The full match will include any leading/trailing whitespace, which should be stripped. If you use capture group 1, you don't need to strip the ends.
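The difference between the full match and capture group 1 can be seen in a small Python sketch. The sample sentence is made up for illustration.

```python
import re

# Full pattern from the answer; group 1 holds the URL without the
# surrounding whitespace.
PATTERN = re.compile(r"(?:^|\s)((?:[^h ]|h(?!elp))+\/pdf\/\S*)(?:$|\s)")

text = "see /dealer/us/en/pdf/simple.pdf for details"
m = PATTERN.search(text)

print(repr(m.group(0)))  # ' /dealer/us/en/pdf/simple.pdf '
print(repr(m.group(1)))  # '/dealer/us/en/pdf/simple.pdf'
```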

Example on regex101

TemporalWolf answered Sep 26 '22 00:09
