I have a regex which works fine in my application, but it matches an empty string too, i.e. no error occurs when the input is empty. How do I modify this regex so that it will not match an empty string? Note that I DON'T want to change any other functionality of this regex.
This is the regex which I'm using: ^([0-9\(\)\/\+ \-]*)$
I don't know a lot about regex formulation myself, which is why I'm asking. I have searched for an answer, but couldn't find a direct one. The closest I got was this: regular expression for anything but an empty string in c#, but that doesn't really work for me.
An empty regular expression matches everything.
∅, the empty set, is a regular expression. ∅ represents the language with no elements, {}.
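The most portable regex would be ^[ \t\n]*$ to match an empty or whitespace-only string (note that you would need to replace \t and \n with literal tab and newline characters if your regex flavor does not understand these escapes) and [^ \n\t] to match a string that contains at least one non-whitespace character.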
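Here is a minimal Python sketch of those two checks. The helper names is_blank and has_content are made up for illustration, and Python's re module understands \t and \n inside the pattern, so no literal tab or newline is needed here.

```python
import re

# Empty string, or only spaces, tabs and newlines.
BLANK = re.compile(r"^[ \t\n]*$")
# At least one character that is not a space, tab or newline, anywhere.
HAS_CONTENT = re.compile(r"[^ \t\n]")


def is_blank(text):
    """Hypothetical helper: True for "" and whitespace-only input."""
    return BLANK.search(text) is not None


def has_content(text):
    """Hypothetical helper: True if any non-whitespace character exists."""
    return HAS_CONTENT.search(text) is not None


if __name__ == "__main__":
    for sample in ["", " \t\n", " a "]:
        print(repr(sample), is_blank(sample), has_content(sample))
```

For these inputs the two checks are complementary, so in practice you would probably use only one of them.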
Example: The regex "aa\n" tries to match two consecutive "a"s at the end of a line, including the newline character itself. Example: "a\+" matches the literal text "a+" and not a series of one or more "a"s. ^ (the caret) is the anchor for the start of the string, or, inside a character class, the negation symbol. Example: "^a" matches "a" at the start of the string.
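The escaping and anchoring rules above can be tried out quickly in Python. This is only a sketch: first_match is a hypothetical helper that returns the first matched text or None.

```python
import re


def first_match(pattern, text):
    """Hypothetical helper: return the first matched substring, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


if __name__ == "__main__":
    # An escaped plus is a literal '+', not a quantifier.
    print(first_match(r"a\+", "a+b"))
    print(first_match(r"a\+", "aaa"))
    # An unescaped plus means "one or more".
    print(first_match(r"a+", "aaa"))
    # The caret outside a character class anchors to the start of the string.
    print(first_match(r"^a", "abc"))
    print(first_match(r"^a", "bac"))
    # The caret as the first character inside a class negates it.
    print(first_match(r"[^a]", "abc"))
    # "aa\n" needs the newline itself to be present.
    print(repr(first_match(r"aa\n", "baa\n")))
```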
There are a lot of pattern types that can match empty strings. The OP's regex is of the ^.*$ type, and it is easy to modify it to prevent empty string matching by replacing the * (= {0,}) quantifier (meaning zero or more) with the + (= {1,}) quantifier (meaning one or more), as has already been mentioned in the posts here.
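Applied to the regex from the question, the change looks like this. It's a small Python sketch (the question is about C#, but this character class behaves the same way in both flavors as far as I know); is_valid is a made-up helper name.

```python
import re

# Pattern from the question: '*' allows zero characters, so "" matches.
ORIGINAL = re.compile(r"^([0-9\(\)\/\+ \-]*)$")
# Same character class, but '+' requires at least one character.
NON_EMPTY = re.compile(r"^([0-9\(\)\/\+ \-]+)$")


def is_valid(text, pattern=NON_EMPTY):
    """Hypothetical helper: True if the whole input matches the pattern."""
    return pattern.match(text) is not None


if __name__ == "__main__":
    for sample in ["", "+1 (555) 123-4567", "abc"]:
        print(repr(sample), is_valid(sample, ORIGINAL), is_valid(sample))
```

Only the empty input changes its result; everything else the original accepted or rejected stays the same, because the character class itself is untouched.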
There are other pattern types matching empty strings, and it is not always obvious how to prevent them from matching empty strings.
Here are a few of those patterns with solutions:
[^"\\]*(?:\\.[^"\\]*)* ⇒ (?:[^"\\]|\\.)+
abc||def ⇒ abc|def (remove the extra | alternation operator)
^a*$ ⇒ ^a+$ (+ matches 1 or more chars)
^(a)?(b)?(c)?$ ⇒ ^(?!$)(a)?(b)?(c)?$ (the (?!$) negative lookahead fails the match if the end of the string is at the start of the string)
or ⇒ ^(?=.)(a)?(b)?(c)?$ (the (?=.) positive lookahead requires at least a single char; . may or may not match line break chars depending on modifiers/regex flavor)
^$|^abc$ ⇒ ^abc$ (remove the ^$ alternative that enables a regex to match an empty string)
^(?:abc|def)?$ ⇒ ^(?:abc|def)$ (remove the ? quantifier that made the (?:abc|def) group optional)
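To check the before/after pairs from the list, something like the following Python sketch can be used. The names PAIRS and matches_empty are made up for this example, and the three optional groups are written as (a)?(b)?(c)? in every variant so that the capture groups behave identically.

```python
import re

# (pattern that can match "", rewritten pattern that cannot)
PAIRS = [
    (r'[^"\\]*(?:\\.[^"\\]*)*', r'(?:[^"\\]|\\.)+'),
    (r"abc||def", r"abc|def"),
    (r"^a*$", r"^a+$"),
    (r"^(a)?(b)?(c)?$", r"^(?!$)(a)?(b)?(c)?$"),
    (r"^(a)?(b)?(c)?$", r"^(?=.)(a)?(b)?(c)?$"),
    (r"^$|^abc$", r"^abc$"),
    (r"^(?:abc|def)?$", r"^(?:abc|def)$"),
]


def matches_empty(pattern):
    """Hypothetical helper: True if the pattern matches the empty string."""
    return re.search(pattern, "") is not None


if __name__ == "__main__":
    for before, after in PAIRS:
        print(before, matches_empty(before), "=>", after, matches_empty(after))

    # The lookahead only rejects the empty input; groups still work as before.
    guarded = re.compile(r"^(?!$)(a)?(b)?(c)?$")
    print(guarded.match("ac").groups())
```

The lookahead variants are handy when you can't simply swap a quantifier, since they leave the rest of the pattern alone.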
To keep \b(?:north|south)?(?:east|west)?\b (that matches north, south, east, west, northeast, northwest, southeast, southwest) from matching an empty string, the word boundaries must be made more precise: make the initial word boundary match only at the start of a word by adding (?<!\w) after it, and make the trailing word boundary match only at the end of a word by adding (?!\w) after it.
\b(?:north|south)?(?:east|west)?\b ⇒ \b(?<!\w)(?:north|south)?(?:east|west)?\b(?!\w)
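A quick way to see the difference between the two compass patterns is to run both over a sample sentence. This is a Python sketch; LOOSE, STRICT and the sample text are made up for illustration.

```python
import re

# Both groups are optional, so this also matches "" at every word boundary.
LOOSE = re.compile(r"\b(?:north|south)?(?:east|west)?\b")
# The start must be a word start, the end must be a word end.
STRICT = re.compile(r"\b(?<!\w)(?:north|south)?(?:east|west)?\b(?!\w)")

SAMPLE = "go northwest, then east"

if __name__ == "__main__":
    # The loose pattern returns the directions mixed with empty matches.
    print(LOOSE.findall(SAMPLE))
    # The strict pattern returns only the directions.
    print(STRICT.findall(SAMPLE))
```

An empty match would need a position that is both a word start and a word end at once, which can't happen, so the strict version only ever returns real words.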
Replace "*" with "+", as "*" means "0 or more occurrences", while "+" means "at least one occurrence"