Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Regular Expression - 4 digits in a row, but can't be all zeros

Tags:

regex

I am looking for a solution that can exclusively be done with a regular expression. I know this would be easy with variables, substrings, etc.

And I am looking for PCRE style regex syntax even though I mention vim.

I need to identify strings with 4 numeric digits, and they can't be all 0's. So the following strings would be a match:

0001 
1000 
1234 
0101

And this would not:

0000

This is a substring that will occur at a set location within a large string, if that matters; I don't think it should. For example

xxxxxxxxxxxx0001xxxxx
xxxxxxxxxxxx1000xxxxx
xxxxxxxxxxxx1234xxxxx
xxxxxxxxxxxx0101xxxxx
xxxxxxxxxxxx0101xxxxx
xxxxxxxxxxxx0000xxxxx
like image 684
user210757 Avatar asked Jun 10 '11 20:06

user210757


People also ask

What is ?: In regex?

It indicates that the subpattern is a non-capture subpattern. That means whatever is matched in (?:\w+\s) , even though it's enclosed by () it won't appear in the list of matches, only (\w+) will.

How do you match a regular expression with digits?

\d (digit) matches any single digit (same as [0-9] ). The uppercase counterpart \D (non-digit) matches any single character that is not a digit (same as [^0-9] ). \s (space) matches any single whitespace (same as [ \t\n\r\f] , blank, tab, newline, carriage-return and form-feed).

How do you escape special characters in regex?

The backslash in a regular expression precedes a literal character. You also escape certain letters that represent common character classes, such as \w for a word character or \s for a space.


2 Answers

 (?<!\d)(?!0000)\d{4}(?!\d)

or, more kindly/maintainably/sanely:

m{
     (?<! \d   )    # current point cannot follow a digit
     (?!  0000 )    # current point must not precede "0000"
     \d{4}          # match four digits at this point, provided...
     (?!  \d   )    # that they are not then followed by another digit
}x
like image 63
tchrist Avatar answered Oct 02 '22 21:10

tchrist


Since I complained that the some of the answers here weren't regular expressions, I thought I'd best give you a regex answer. This is primitive, there's probably a better way, but it does work:

([1-9][0-9][0-9][0-9]|[0-9][1-9][0-9][0-9]|[0-9][0-9][1-9][0-9]|[0-9][0-9][0-9][1-9])

This checks for something which contains 0-9 in each location, except one which must lie in 1-9, preventing 0000 from matching. You can probably write this simpler using \d instead of [0-9] if your regex parser supports that metacharacter.

like image 32
Richard J Avatar answered Oct 02 '22 19:10

Richard J