
Make regular expression not match empty string?

Tags:

regex

I have a regex which works fine in my application, but it also matches an empty string, i.e. no error occurs when the input is empty. How do I modify this regex so that it will not match an empty string? Note that I DON'T want to change any other functionality of this regex.

This is the regex which I'm using: ^([0-9\(\)\/\+ \-]*)$

I don't know a lot about regex formulation myself, which is why I'm asking. I have searched for an answer, but couldn't find a direct one. The closest I got was this: regular expression for anything but an empty string in c#, but that doesn't really work for me.

Ahmad asked Oct 27 '13 16:10



2 Answers

There are a lot of pattern types that can match empty strings. The OP's regex is of the ^.*$ type, and it is easy to stop it from matching an empty string: replace the * (= {0,}) quantifier (meaning zero or more) with the + (= {1,}) quantifier (meaning one or more), as has already been mentioned in the posts here.
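As a quick check of the * to + swap, here is a minimal Python sketch using the standard re module. The helper name accepts and the sample inputs are made up for illustration, and other regex flavors may differ in small details (for example how $ treats a trailing newline).

```python
import re

# The pattern from the question, and the same pattern with * replaced by +.
ORIGINAL = re.compile(r"^([0-9\(\)\/\+ \-]*)$")
FIXED = re.compile(r"^([0-9\(\)\/\+ \-]+)$")


def accepts(pattern, text):
    """Return True if the pattern matches the text (hypothetical helper)."""
    return pattern.match(text) is not None


# Hypothetical sample inputs: empty, phone-like, digits with a slash, letters.
samples = ["", "+1 (555) 123-4567", "12/34", "abc"]

for s in samples:
    print(repr(s), "original:", accepts(ORIGINAL, s), "fixed:", accepts(FIXED, s))
```

Only the empty string changes from accepted to rejected; every other input gets the same verdict from both patterns.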

There are other pattern types matching empty strings, and it is not always obvious how to prevent them from matching empty strings.

Here are a few of those patterns with solutions:

[^"\\]*(?:\\.[^"\\]*)*  →  (?:[^"\\]|\\.)+ (requires one or more plain or escaped chars)

abc||def  →  abc|def (remove the extra | alternation operator)

^a*$  →  ^a+$ (+ matches 1 or more chars)

^(a)?(b)?(c)?$  →  ^(?!$)(a)?(b)?(c)?$ (the (?!$) negative lookahead fails the match if the end of string is at the start of the string)
              or  ^(?=.)(a)?(b)?(c)?$ (the (?=.) positive lookahead requires at least a single char; whether . matches line break chars depends on the modifiers and regex flavor)

^$|^abc$  →  ^abc$ (remove the ^$ alternative that enables the regex to match an empty string)

^(?:abc|def)?$  →  ^(?:abc|def)$ (remove the ? quantifier that made the (?:abc|def) group optional)
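The pairs above can be checked mechanically. The following is a rough Python sketch (standard re module only) that tests whether each pattern can match the empty string in full; the PAIRS list and the matches_empty helper are names invented for this example, and the lookahead rows are written with (c)? in both columns so that only the lookahead differs.

```python
import re

# Each entry: (pattern that matches "", rewritten pattern that should not).
PAIRS = [
    (r'[^"\\]*(?:\\.[^"\\]*)*', r'(?:[^"\\]|\\.)+'),
    (r'abc||def', r'abc|def'),
    (r'^a*$', r'^a+$'),
    (r'^(a)?(b)?(c)?$', r'^(?!$)(a)?(b)?(c)?$'),
    (r'^(a)?(b)?(c)?$', r'^(?=.)(a)?(b)?(c)?$'),
    (r'^$|^abc$', r'^abc$'),
    (r'^(?:abc|def)?$', r'^(?:abc|def)$'),
]


def matches_empty(pattern):
    """Return True if the pattern can match the empty string as a whole."""
    return re.fullmatch(pattern, "") is not None


for before, after in PAIRS:
    print(f"{before!r}: {matches_empty(before)}  ->  {after!r}: {matches_empty(after)}")
```

Every left-hand pattern reports True and every right-hand pattern reports False, while non-empty inputs such as "ac" for the lookahead variants are still accepted.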

To prevent \b(?:north|south)?(?:east|west)?\b (which matches north, south, east, west, northeast, northwest, southeast and southwest) from matching an empty string, the word boundaries must be made more precise: make the initial word boundary match only at the start of a word by adding (?<!\w) after it, and make the trailing word boundary match only at the end of a word by adding (?!\w) after it.

\b(?:north|south)?(?:east|west)?\b  →  \b(?<!\w)(?:north|south)?(?:east|west)?\b(?!\w)
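To see the effect of the extra lookarounds, here is a small Python sketch (assuming Python 3.7 or later, where findall reports empty matches next to non-empty ones). The names LOOSE and STRICT and the sample sentence are made up for illustration.

```python
import re

# Pattern with plain word boundaries: both groups are optional, so it can
# match the empty string at any word boundary.
LOOSE = re.compile(r"\b(?:north|south)?(?:east|west)?\b")

# Pattern with the boundaries pinned to the start and end of a word.
STRICT = re.compile(r"\b(?<!\w)(?:north|south)?(?:east|west)?\b(?!\w)")

text = "go northwest, then east"  # hypothetical sample input

loose_matches = LOOSE.findall(text)
strict_matches = STRICT.findall(text)

print(loose_matches)   # contains empty strings alongside the directions
print(strict_matches)  # ['northwest', 'east']
```

An empty match would need a word boundary with no word character on either side, which is impossible, so the stricter pattern only ever returns real direction words.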

Wiktor Stribiżew answered Oct 25 '22 00:10


Replace "*" with "+": "*" means "0 or more occurrences", while "+" means "at least one occurrence". Your regex then becomes ^([0-9\(\)\/\+ \-]+)$

lejlot answered Oct 25 '22 01:10