
Regex to match digits and at most one space between them

Tags: python, regex

I am trying to match phone numbers in the following patterns:

9 99 99 99 99
0999999999
11 0999999999
9 9999 9999

But not the following:

9 99  99 99 99 (two spaces)
9 99\n99 99 99 

Therefore, I want to match 7 to 12 digits with optional spaces between them, but never more than one whitespace character in a row.

So far I came up with "[\d ?]{7,12}", but it doesn't meet the requirements: the spaces are counted in the {7,12}, and it also matches runs of two or more spaces.

asked Aug 31 '18 16:08 by jcp



3 Answers

[\d ?]{7,12} is a pattern that matches 7 to 12 characters, each of which is a digit, a space, or a ?. It can match a ??????? string because, inside a character class, ? is not a quantifier but a literal question mark.
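A quick way to see this is to run the pattern from the question against a few strings that should clearly not count as phone numbers. This is only a minimal sketch using Python's re module; the variable names are made up for illustration.

```python
import re

# The pattern from the question: one character class, repeated 7 to 12 times.
original = re.compile(r"[\d ?]{7,12}")

# Inside [...] the ? is a literal question mark, and a space is just another
# allowed character, so each of these strings matches in full.
only_question_marks = original.fullmatch("???????") is not None
only_spaces = original.fullmatch(" " * 7) is not None
double_space = original.fullmatch("9 99  99 99") is not None  # two spaces in a row

print(only_question_marks, only_spaces, double_space)  # True True True
```

All three results are True, which shows that the class counts characters, not digits.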

Changing it to (?:\d ?){7,12} only partially solves the problem, because it can still match a trailing space. I suggest using

\b\d(?: ?\d){6,11}\b

See the regex demo

The word boundaries \b make sure the match neither starts nor ends in the middle of a longer run of digits or other word characters.

Details

  • \b - leading word boundary
  • \d - a digit
  • (?: ?\d){6,11} - 6 to 11 consecutive sequences of
    • " ?" (a space followed by ?) - an optional space
    • \d - a single digit
  • \b - trailing word boundary.
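
The breakdown above can be checked against the examples from the question. This is a small sketch in Python; the helper name is_phone is hypothetical, and it uses fullmatch on the assumption that each string is meant to be validated as a whole.

```python
import re

# Pattern from this answer: a digit, then 6 to 11 "optional space + digit" pairs.
PHONE = re.compile(r"\b\d(?: ?\d){6,11}\b")

def is_phone(text: str) -> bool:
    """Return True if the whole string is a 7 to 12 digit number
    with at most one space between neighbouring digits."""
    return PHONE.fullmatch(text) is not None

should_match = ["9 99 99 99 99", "0999999999", "11 0999999999", "9 9999 9999"]
should_not_match = ["9 99  99 99 99", "9 99\n99 99 99"]

accepted = [is_phone(s) for s in should_match]
rejected = [is_phone(s) for s in should_not_match]

print(accepted)  # [True, True, True, True]
print(rejected)  # [False, False]
```

With re.search instead of fullmatch, the two rejected strings still produce no match, because neither side of the double space or the newline contains 7 digits.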
answered Nov 13 '22 17:11 by Wiktor Stribiżew


You can use

\d(\s?\d){6,11}

The first \d matches the first digit. It can be followed by 6 to 11 repetitions (for a total of 7 to 12 digits) of an optional whitespace character followed by a digit. Multiple spaces are not allowed, because each optional space must have a digit on both sides. It can be checked here. That regexp is equivalent to, but shorter than, this one:

\d\s?\d\s?\d\s?\d\s?\d\s?\d\s?\d(((((\s?\d)?\s?\d)?\s?\d)?\s?\d)?\s?\d)?

that can be checked here.

NOTE

Note that \s also matches a newline, so a number can match across lines (as in the second non-matching example in the question). If you don't want that behaviour, narrow the whitespace class to a plain space, as in

\d( ?\d){6,11}

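That can be tested here. Note that a number with more than 12 digits is now truncated to its first twelve. If this is not desired, use word boundaries. A trailing \b alone is not enough, because the search can then restart one digit later and match the last twelve digits instead, so anchor both ends, as in

\b\d( ?\d){6,11}\b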

See it here.
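
The differences between these variants are easy to check in Python. The following is only an illustrative sketch; the variable names and the 13-digit test string are assumptions made for the demonstration.

```python
import re

any_whitespace = re.compile(r"\d(\s?\d){6,11}")
literal_space = re.compile(r"\d( ?\d){6,11}")

# \s accepts the newline, so these 9 digits match across two lines.
multiline = "9 99\n99 99 99"
matches_across_lines = any_whitespace.fullmatch(multiline) is not None
# With a literal space, neither line has 7 digits on its own.
rejects_newline = literal_space.search(multiline) is None

# A 13-digit number shows how the boundaries change the result.
thirteen = "1234567890123"
no_boundary = literal_space.search(thirteen).group(0)
trailing_only = re.search(r"\d( ?\d){6,11}\b", thirteen).group(0)
both_boundaries = re.search(r"\b\d( ?\d){6,11}\b", thirteen)

print(matches_across_lines, rejects_newline)  # True True
print(no_boundary)      # 123456789012 (first twelve digits)
print(trailing_only)    # 234567890123 (last twelve digits)
print(both_boundaries)  # None
```

With only a trailing \b the search still finds twelve digits by starting one position later, so a leading \b is needed as well to reject the long number outright.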

answered Nov 13 '22 17:11 by Luis Colorado


I'd try

(?:\d ?){7,12}

The original regex matches a character class (a digit, a space, or a question mark) seven to twelve times. The supplied regex matches a digit followed by an optional space seven to twelve times. That way the spaces aren't counted as part of the total.
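
What the {7,12} actually counts depends on what sits inside the group, which a short Python check can make concrete. This sketch compares a group holding a single \d with one holding \d+; the variable names are illustrative only.

```python
import re

single_digit_groups = re.compile(r"(?:\d ?){7,12}")
digit_run_groups = re.compile(r"(?:\d+ ?){7,12}")
char_class = re.compile(r"[\d ?]{7,12}")

# 9 digits and 4 spaces: 13 characters in total.
spaced = "9 99 99 99 99"
group_counts_digits = single_digit_groups.fullmatch(spaced) is not None
class_counts_chars = char_class.fullmatch(spaced) is None  # 13 characters > 12

# With \d+ each repetition may swallow several digits, so the upper bound
# of 12 no longer limits the total number of digits.
twenty_digits = "1" * 20
single_rejects_long = single_digit_groups.fullmatch(twenty_digits) is None
run_accepts_long = digit_run_groups.fullmatch(twenty_digits) is not None

# The optional space may also be matched at the very end.
allows_trailing_space = single_digit_groups.fullmatch("0999999 ") is not None

print(group_counts_digits, class_counts_chars)   # True True
print(single_rejects_long, run_accepts_long)     # True True
print(allows_trailing_space)                     # True
```

The last check shows that this form can still end on a space, so a pattern that puts the optional space before each following digit is the safer choice when that matters.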

answered Nov 13 '22 16:11 by watt smith