Finding match by negating (based on a missing string)

I've been googling a lot about this. What I'm trying to achieve is the following: I need a regex condition that checks whether a MIME type is NOT of a specific type.

For example, I have received the following message:

image/png, image/jpeg, document/pdf

I would like to detect only the document/pdf part, that is, a MIME type string that does NOT start with image/.

But no matter how hard I looked, tried and played around with the RegExBody software, I utterly fail to match it.

I'm posting this in despair, hoping that a regex expert can help me out.

I tried many approaches, mainly trying to find the non-image type, whether or not one is present.

None of it works. I tried both positive and negative lookaheads, but I probably used them incorrectly. I can't post every attempt because I have tried and deleted so many. The one that seemed closest to working was \b(?:(?!image/\w+))\w+\b, but it keeps selecting the second part of the type I want to reject. For example, given image/png it matches png, which means the check still returns true even though I meant it to ignore image/ types.

SKYFALL26 asked May 08 '16 14:05


1 Answer

You need to add a /\w+ part after \w+ so that the pattern matches the whole type/subtype substring:

\b(?!image/)\w+/\w+\b

See the regex demo

Pattern details:

  • \b - word boundary
  • (?!image/) - a negative lookahead failing the match if there is image/ right after the current location
  • \w+/\w+ - 1+ word characters followed by / and again 1+ word characters
  • \b - a trailing word boundary
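
As a rough illustration, here is a minimal Python sketch that runs both patterns against the sample string from the question. The names NON_IMAGE, ATTEMPT and header are made up for this example. It shows why the original attempt matched png: the lookahead is only checked where a match starts, so the pattern can still start after the slash. Requiring the /\w+ part makes such a bare subtype fail. Note that \w does not cover characters such as -, + or . that can appear in real MIME types, so treat this as a sketch for the simple types shown here.

```python
import re

# Pattern from the answer: a type/subtype token that does not start with "image/".
NON_IMAGE = re.compile(r"\b(?!image/)\w+/\w+\b")

# Pattern from the question, kept only for comparison.
ATTEMPT = re.compile(r"\b(?:(?!image/\w+))\w+\b")

# Sample input taken from the question.
header = "image/png, image/jpeg, document/pdf"

# Only the full non-image type is returned.
print(NON_IMAGE.findall(header))       # ['document/pdf']

# The attempt skips "image" but still matches each bare word after it.
print(ATTEMPT.findall(header))         # ['png', 'jpeg', 'document', 'pdf']

# An image-only input yields no match at all with the fixed pattern.
print(NON_IMAGE.findall("image/png"))  # []
```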
Wiktor Stribiżew answered Oct 11 '22 15:10