
Using regex to match numbers which have 5 increasing consecutive digits somewhere in them

Tags: regex

First off, this has sort of been asked before. However, I haven't been able to adapt that answer to fit my requirement.

In short: I want a regex that matches an expression if and only if it only contains digits, and there are 5 (or more) increasing consecutive digits somewhere in the expression.

I understand the logic of

^(?=\d{5}$)1*2*3*4*5*6*7*8*9*0*$

however, this limits the expression to exactly 5 digits. I want to allow additional digits before and after the increasing run, so 1111345671111 should match, while 11111 shouldn't.

I thought this might work:

^[0-9]*(?=\d{5}0*1*2*3*4*5*6*7*8*9*)[0-9]*$

which I interpret as:

  • ^$: The entire expression must only contain what's between these 2 symbols

  • [0-9]*: Any digits between 0-9, 0 or more times followed by:

  • (?=\d{5}0*1*2*3*4*5*6*7*8*9*): A part where at least 5 increasing digits are found followed by:

  • [0-9]*: Any digits between 0-9, 0 or more times.

However, this regex is incorrect: for example, 11111 matches. How can I solve this problem using a regex? Examples of expressions that should match:

  • 00001459000
  • 12345

These shouldn't match:

  • abc12345
  • 9871234444

oscarloo asked Jan 26 '19 20:01




2 Answers

While this problem can be solved using pure regular expressions (the set of strictly ascending five-digit strings is finite, so you could just enumerate all of them), it's not a good fit for regexes.
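The enumeration mentioned above could be sketched in Python roughly as follows. This is only an illustration, not part of the original answer: the names ASCENDING_RUNS, ENUMERATED and has_ascending_run are made up for this example, and it assumes "increasing" means strictly increasing, as the question's examples suggest.

```python
import re
from itertools import combinations

# Every strictly ascending 5-digit string is a 5-element combination of the
# sorted digits "0123456789", so there are C(10, 5) = 252 of them.
ASCENDING_RUNS = ["".join(c) for c in combinations("0123456789", 5)]

# Digits only, with one of the enumerated runs somewhere inside.
ENUMERATED = re.compile(
    r"[0-9]*(?:" + "|".join(ASCENDING_RUNS) + r")[0-9]*"
)


def has_ascending_run(text: str) -> bool:
    """Return True if text is all digits and contains an enumerated run."""
    return ENUMERATED.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["00001459000", "12345", "abc12345", "9871234444", "11111"]:
        print(sample, has_ascending_run(sample))
```

The resulting pattern is long but entirely mechanical, which is one way to see why a hand-written regex is an awkward fit for this problem.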

That said, here's how I'd do it if I had to:

^\d*(?=\d{5}(\d*)$)0?1?2?3?4?5?6?7?8?9?\1$

Core idea: 0?1?2?3?4?5?6?7?8?9? matches an ascending numeric substring, but it doesn't restrict its length. Every single part is optional, so it can match anything from "" (empty string) to the full "0123456789".

We can force it to match exactly 5 characters by combining two things: a look-ahead that matches five digits plus an arbitrary suffix (which we capture), and a backreference \1, which must match exactly the suffix captured by the look-ahead. Because \1 is followed by $, the ascending part in between must have consumed exactly 5 characters of the string.

Live demo: https://regex101.com/r/03rJET/3
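As a quick check, the pattern can be run against the question's examples with Python's re module. This is a sketch under the assumption that Python's regex flavor behaves like the one used in the demo for this pattern; the names BACKREF, matches and CASES are hypothetical.

```python
import re

# The pattern from this answer, unchanged. Group 1 is captured inside the
# look-ahead and reused by the backreference \1.
BACKREF = re.compile(r"^\d*(?=\d{5}(\d*)$)0?1?2?3?4?5?6?7?8?9?\1$")


def matches(text: str) -> bool:
    """Return True if the backreference pattern accepts text."""
    return BACKREF.search(text) is not None


# Inputs taken from the question, paired with the result the asker wants.
CASES = {
    "00001459000": True,
    "12345": True,
    "1111345671111": True,
    "abc12345": False,
    "9871234444": False,
    "11111": False,
}

if __name__ == "__main__":
    for text, expected in CASES.items():
        print(text, matches(text), "expected:", expected)
```

Note that in Python $ also matches just before a trailing newline, so input read from a file may need to be stripped first.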

(By the way, your explanation of (?=\d{5}0*1*2*3*4*5*6*7*8*9*) is incorrect: It looks ahead to match exactly 5 digits, followed by 0 or more occurrences of 0, followed by 0 or more occurrences of 1, etc.)
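To illustrate that point: because every starred part of the look-ahead may match nothing, the look-ahead succeeds as soon as any five digits follow. A small Python sketch (ATTEMPT and accepts are names invented for this example) suggests the question's pattern accepts any all-digit string of length 5 or more:

```python
import re

# The pattern attempted in the question, unchanged.
ATTEMPT = re.compile(r"^[0-9]*(?=\d{5}0*1*2*3*4*5*6*7*8*9*)[0-9]*$")


def accepts(text: str) -> bool:
    """Return True if the question's attempted pattern accepts text."""
    return ATTEMPT.search(text) is not None


if __name__ == "__main__":
    # Both are accepted although neither contains 5 increasing digits.
    print(accepts("11111"), accepts("9871234444"))
    # Too short, or containing non-digits: rejected.
    print(accepts("1234"), accepts("abc12345"))
```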

melpomene answered Oct 22 '22 01:10


Because the starting position of the increasing digits isn't known in advance, and the run of increasing digits doesn't have to end at the end of the string, the linked answer's concise pattern won't work here. I don't think this is possible without being repetitive: you have to alternate between all the possibilities for an increasing digit.

A 0 must be followed by [1-9], which can be written as 0(?=[1-9]). A 1 must be followed by [2-9], a 2 by [3-9], and so on. Put these alternatives in a group and repeat that group four times, then match any digit after that. The lookahead in the fourth repetition ensures that this fifth digit is in sequence as well.

First, use a lookahead to assert that only digits remain up to the end of the string, then match four repetitions of the alternation described above, followed by one or more digits:

^(?=\d+$)\d*?(?:0(?=[1-9])|1(?=[2-9])|2(?=[3-9])|3(?=[4-9])|4(?=[5-9])|5(?=[6-9])|6(?=[7-9])|7(?=[89])|8(?=9)){4}\d+

Separated out for better readability:

^(?=\d+$)\d*?
  (?:
    0(?=[1-9])|
    1(?=[2-9])|
    2(?=[3-9])|
    3(?=[4-9])|
    4(?=[5-9])|
    5(?=[6-9])|
    6(?=[7-9])|
    7(?=[89])|
    8(?=9)
  ){4}
\d+

The lazy quantifier \d*? in the first line isn't necessary, but it makes the pattern a bit more efficient. Without it, \d* initially matches the whole string greedily, which requires many failed alternations and a lot of backtracking until the match position is at least 5 characters before the end of the string.
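In Python, the separated-out form can be kept as-is by compiling it with re.VERBOSE, which ignores whitespace in the pattern and allows comments. This is a sketch only; ALTERNATION and has_run are hypothetical names, and the comments reflect my reading of the pattern.

```python
import re

ALTERNATION = re.compile(
    r"""
    ^(?=\d+$)          # the whole string must consist of digits
    \d*?               # lazily skip any prefix
    (?:
        0(?=[1-9]) | 1(?=[2-9]) | 2(?=[3-9]) |
        3(?=[4-9]) | 4(?=[5-9]) | 5(?=[6-9]) |
        6(?=[7-9]) | 7(?=[89])  | 8(?=9)
    ){4}               # four digits, each followed by a larger digit
    \d+                # the fifth digit of the run, plus any suffix
    """,
    re.VERBOSE,
)


def has_run(text: str) -> bool:
    """Return True if the alternation pattern accepts text."""
    return ALTERNATION.search(text) is not None


if __name__ == "__main__":
    for sample in ["00001459000", "12345", "abc12345", "9871234444", "11111"]:
        print(sample, has_run(sample))
```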

https://regex101.com/r/03rJET/2

It's ugly, but it works.

CertainPerformance answered Oct 22 '22 01:10