How to negate any regular expression in Java

I have a regular expression which I want to negate, e.g.

/(.{0,4})

for which String.matches returns the following:

"/1234" true
"/12" true
"/" true
"" false
"1234" false
"/12345" false

Is there a way to negate the above (using regex only) so that the results are:

"/1234" false
"/12" false
"/" false
"" true
"1234" true
"/12345" true

I'm looking for a general solution that would work for any regex without rewriting the whole regex.

I have looked at "How to negate the whole regex?", which suggests using (?!pattern), but that doesn't seem to work for me.

The following regex

(?!/(.{0,4}))

returns the following:

"/1234" false
"/12" false
"/" false
"" true
"1234" false
"/12345" false

which is not what I want. Any help would be appreciated.

Wayne, asked Dec 22 '11 23:12



1 Answer

You need to add anchors. The original regex (minus the unneeded parentheses):

/.{0,4}

...matches a string that contains a slash followed by zero to four more characters. But because you're using the matches() method, it's automatically anchored at both ends, as if it were really:

^/.{0,4}$
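
To make the implicit anchoring visible, here is a small sketch that compares matches() with Matcher.find() on the same unanchored pattern. The class and method names (AnchoringDemo, wholeMatch, partialMatch) are made up for illustration and are not part of the question.

```java
import java.util.regex.Pattern;

class AnchoringDemo {
    // The unanchored pattern from the answer: a slash plus zero to four characters.
    static final Pattern SLASH = Pattern.compile("/.{0,4}");

    // matches() succeeds only if the pattern consumes the entire input.
    static boolean wholeMatch(String s) {
        return SLASH.matcher(s).matches();
    }

    // find() succeeds if the pattern matches anywhere in the input.
    static boolean partialMatch(String s) {
        return SLASH.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"/1234", "/12345", "1234"}) {
            System.out.println("\"" + s + "\" matches=" + wholeMatch(s)
                    + " find=" + partialMatch(s));
        }
    }
}
```

For "/12345", find() reports a hit (on the prefix "/1234") while matches() does not, which is the difference the explicit ^ and $ spell out.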

To achieve the inverse of that, you can't rely on automatic anchoring; you have to make at least the end anchor explicit within the lookahead. You also have to "pad" the regex with a .* because matches() requires the regex to consume the whole string:

(?!/.{0,4}$).*
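
As a rough illustration of why both the $ and the .* are needed, the sketch below runs three variants through matches(): the bare lookahead, the lookahead padded with .* but without the end anchor, and the version above. The class name LookaheadStepsDemo and the constant names are hypothetical.

```java
import java.util.regex.Pattern;

class LookaheadStepsDemo {
    // Lookahead only: it is zero-width, so matches() can only succeed on "".
    static final Pattern BARE = Pattern.compile("(?!/.{0,4})");
    // Padded with .* but no end anchor: rejects anything that merely starts with a slash.
    static final Pattern PADDED = Pattern.compile("(?!/.{0,4}).*");
    // Padded and anchored inside the lookahead: the regex from the answer.
    static final Pattern FIXED = Pattern.compile("(?!/.{0,4}$).*");

    static boolean test(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {"/1234", "/12", "/", "", "1234", "/12345"};
        for (String s : inputs) {
            System.out.println("\"" + s + "\" bare=" + test(BARE, s)
                    + " padded=" + test(PADDED, s)
                    + " fixed=" + test(FIXED, s));
        }
    }
}
```

Only the third variant accepts both "1234" and "/12345" while still rejecting "/1234", "/12" and "/".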

But I recommend that you explicitly anchor the whole regex, like so:

^(?!/.{0,4}$).*$

It does no harm, and it makes your intention perfectly clear, especially to people who learned regexes from other flavors like Perl or JavaScript. The automatic anchoring of the matches() method is highly unusual.
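
Since the question asks for something that works for any regex, one possible generalization is to wrap the original pattern in a non-capturing group inside the anchored lookahead. The helper below is a sketch, not an established API: the names NegatedRegexDemo, negate and negatedMatch are invented here, and it assumes inputs without line terminators (otherwise . would need the DOTALL flag and $ would need closer attention).

```java
import java.util.regex.Pattern;

class NegatedRegexDemo {
    // Hypothetical helper: builds ^(?!(?:regex)$).*$ around the given regex.
    // The (?:...) group keeps alternations in the original regex bound to the $
    // and does not shift the numbering of its capturing groups.
    static Pattern negate(String regex) {
        return Pattern.compile("^(?!(?:" + regex + ")$).*$");
    }

    // True when the whole input does NOT match the original regex.
    static boolean negatedMatch(String regex, String input) {
        return negate(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String original = "/(.{0,4})"; // the regex from the question
        String[] inputs = {"/1234", "/12", "/", "", "1234", "/12345"};
        for (String s : inputs) {
            System.out.println("\"" + s + "\" " + negatedMatch(original, s));
        }
    }
}
```

Run against the six strings from the question, this should reproduce the desired table: false for "/1234", "/12" and "/", and true for "", "1234" and "/12345".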

Alan Moore, answered Sep 19 '22 15:09