
Java RegEx that matches anything BUT literal string 'NIL' or 'nil'

Tags: java, regex, null

OK, guys. Here's a Java interview-type question that seems to have stumped some very smart people around here. They actually need this for production code, so it's more than just an interview puzzler.

They need a regular expression, in Java, that matches any string other than the 3-letter word NIL. The test needs to be case-insensitive, and the RegEx itself must do all the work.

So, the RegEx should reject NIL, nil, NiL, nIL, and so on.

It should, however, accept: nile, anil, will, zappa-nil-a, and the empty string.

How many Java developers does it take to write a trivial RegEx? Apparently a lot!

Armchair Bronco asked Apr 20 '12 23:04




1 Answer

You can do this using a negative lookahead.

With the case-insensitive option enabled (Pattern.CASE_INSENSITIVE, or an inline (?i) at the start of the pattern):

^(?!nil$).*
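
As a rough illustration of how that pattern might be wired up in Java, here is a minimal sketch. The class and method names (NotNilDemo, accepts, acceptsInline) are made up for this example and are not part of the original answer. It shows two ways to turn on case insensitivity: the Pattern.CASE_INSENSITIVE flag, and the inline (?i) flag, which keeps everything inside the regex itself.

```java
import java.util.regex.Pattern;

class NotNilDemo {
    // The answer's pattern, compiled once with the case-insensitive flag.
    static final Pattern NOT_NIL =
            Pattern.compile("^(?!nil$).*", Pattern.CASE_INSENSITIVE);

    // True for any input other than "nil" in any letter case.
    static boolean accepts(String s) {
        return NOT_NIL.matcher(s).matches();
    }

    // Same check with the inline (?i) flag, so the regex does all the work.
    static boolean acceptsInline(String s) {
        return s.matches("(?i)^(?!nil$).*");
    }

    public static void main(String[] args) {
        String[] samples = {
            "NIL", "nil", "NiL", "nIL",                  // should be rejected
            "nile", "anil", "will", "zappa-nil-a", ""    // should be accepted
        };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + accepts(s));
        }
    }
}
```

Note that . does not match line terminators by default, so with matches() an input containing a newline is rejected unless Pattern.DOTALL is also set.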

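You could leave off the .* at the end if you don't need the match to actually return the string, for example when you only test with Matcher.find(). Keep it if you use String.matches() or Matcher.matches(), since those require the pattern to match the entire input. Here is a version without the case-insensitive option: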

^(?![nN][iI][lL]$).*

Explanation:

^       # start of string anchor
(?!     # start negative lookahead (fail if...)
   nil    # literal characters 'nil'
   $      # end of string
)       # end lookahead
.*      # consume string (not necessary, but it acts more like a typical regex)
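
The commented layout above can be kept more or less verbatim in Java source by compiling with Pattern.COMMENTS, which ignores whitespace and # comments inside the pattern. The following is only a sketch under that assumption; the class name CommentedNotNil and the helper accepts are hypothetical names chosen for this example.

```java
import java.util.regex.Pattern;

class CommentedNotNil {
    // COMMENTS mode ignores whitespace and everything from '#' to end of line,
    // so each piece of the regex can carry its own explanation.
    static final Pattern NOT_NIL = Pattern.compile(
            "^        # start of string anchor\n"
          + "(?!      # start negative lookahead (fail if...)\n"
          + "   nil   # literal characters 'nil'\n"
          + "   $     # end of string\n"
          + ")        # end lookahead\n"
          + ".*       # consume the rest of the string\n",
            Pattern.COMMENTS | Pattern.CASE_INSENSITIVE);

    static boolean accepts(String s) {
        return NOT_NIL.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("NiL"));   // rejected
        System.out.println(accepts("anil"));  // accepted
    }
}
```

One caveat with this mode: a literal space or # in the pattern would have to be escaped, which is not an issue for this particular regex.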

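By default, $ in Java also matches just before a final line terminator, so the pattern above rejects nil\n as well. If you want the regex to match nil\n, then use \z instead of $ in the lookahead: ^(?!nil\z).*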
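
To make the difference between $ and \z concrete, here is a small sketch that runs both variants against "nil" and "nil\n". The class and field names (EndAnchorDemo, WITH_DOLLAR, WITH_Z, WITH_Z_DOTALL) are invented for this illustration. It uses find() for the first two patterns, and adds Pattern.DOTALL for the matches() case, because . does not match a newline by default.

```java
import java.util.regex.Pattern;

class EndAnchorDemo {
    // '$' matches at end of input and also just before a final line terminator.
    static final Pattern WITH_DOLLAR =
            Pattern.compile("^(?!nil$).*", Pattern.CASE_INSENSITIVE);

    // '\z' matches only at the very end of input.
    static final Pattern WITH_Z =
            Pattern.compile("^(?!nil\\z).*", Pattern.CASE_INSENSITIVE);

    // DOTALL lets '.' match the newline, so matches() can cover the whole input.
    static final Pattern WITH_Z_DOTALL =
            Pattern.compile("^(?!nil\\z).*",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(found(WITH_DOLLAR, "nil\n"));            // rejected
        System.out.println(found(WITH_Z, "nil\n"));                 // accepted
        System.out.println(found(WITH_Z, "nil"));                   // rejected
        System.out.println(WITH_Z_DOTALL.matcher("nil\n").matches()); // accepted
    }
}
```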

Andrew Clark answered Sep 22 '22 11:09
