Have regex in our project that matches any url that contains the string "/pdf/":
(.+)/pdf/.+
Need to modify it so that it won't match urls that also contain "help"
Example:
Shouldn't match: "/dealer/help/us/en/pdf/simple.pdf" Should match: "/dealer/us/en/pdf/simple.pdf"
Literal Characters and Sequences For instance, you might need to search for a dollar sign ("$") as part of a price list, or in a computer program as part of a variable name. Since the dollar sign is a metacharacter which means "end of line" in regex, you must escape it with a backslash to use it literally.
Basically (0+1)* mathes any sequence of ones and zeroes. So, in your example (0+1)*1(0+1)* should match any sequence that has 1. It would not match 000 , but it would match 010 , 1 , 111 etc. (0+1) means 0 OR 1.
If lookarounds are supported, this is very easy to achieve:
(?=.*/pdf/)(?!.*help)(.+)
See a demo on regex101.com.
(?:^|\s)((?:[^h ]|h(?!elp))+\/pdf\/\S*)(?:$|\s)
First thing is match either a space or the start of a line
(?:^|\s)
Then we match anything that is not a or
h
OR any h
that does not have elp
behind it, one or more times +
, until we find a /pdf/
, then match non-space characters \S
any number of times *
.
((?:[^h ]|h(?!elp))+\/pdf\/\S*)
If we want to detect help
after the /pdf/
, we can duplicate matching from the start.
((?:[^h ]|h(?!elp))+\/pdf\/(?:[^h ]|h(?!elp))+)
Finally, we match a or end line/string (
$
)
(?:$|\s)
The full match will include leading/trailing spaces, and should be stripped. If you use capture group 1, you don't need to strip the ends.
Example on regex101
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With