Sorry for asking such a simple question; I'm still an inexperienced programmer. I stumbled across a phone-number-matching regex in some old Perl code at work, and I'd love it if somebody could explain exactly what it means (my regex skills are severely lacking).
if ($value !~ /^\+[[:space:]]*[0-9][0-9.[:space:]-]*(\([0-9.[:space:]-]*[0-9][0-9.[:space:]-]*\))?([0-9.[:space:]-]*[0-9][0-9.[:space:]-]*)?([[:space:]]+ext.[0-9.[:space:]-]*[0-9][0-9.[:space:]-]*)?$/i) {
    ...
}
Thank you in advance :)
The code roughly says "you should replace this with Number::Phone".
All joking and good advice aside, the first thing to do when figuring out a regex is to expand it with /x. The first pass is to break things up by capture group.
/^
 \+[[:space:]]*[0-9][0-9.[:space:]-]*
 (\([0-9.[:space:]-]*[0-9][0-9.[:space:]-]*\))?
 ([0-9.[:space:]-]*[0-9][0-9.[:space:]-]*)?
 ([[:space:]]+ext.[0-9.[:space:]-]*[0-9][0-9.[:space:]-]*)?
$/xi
Then, since this is dominated by character sets, I'd space by character sets.
/^
 \+ [[:space:]]* [0-9] [0-9.[:space:]-]*
 ( \( [0-9.[:space:]-]* [0-9] [0-9.[:space:]-]* \) )?
 ( [0-9.[:space:]-]* [0-9] [0-9.[:space:]-]* )?
 ( [[:space:]]+ ext . [0-9.[:space:]-]* [0-9] [0-9.[:space:]-]* )?
$/xi
Now you can start to see some similar elements. Try lining those up to see the similarities.
/^
 \+        [[:space:]]* [0-9] [0-9.[:space:]-]*
 ( \( [0-9.[:space:]-]* [0-9] [0-9.[:space:]-]* \) )?
 (    [0-9.[:space:]-]* [0-9] [0-9.[:space:]-]*    )?
 ( [[:space:]]+ 
   ext . 
      [0-9.[:space:]-]* [0-9] [0-9.[:space:]-]* 
 )?
$/xi
Then zero in on an element and try to figure it out.  This is the important one: [0-9.[:space:]-]*, meaning "zero or more digits, whitespace characters, dashes or dots".  That doesn't make much sense on its own for phone parsing, but maybe it will make more sense in context.  Let's look at a line where we can guess what it's trying to do.
( \( [0-9.[:space:]-]* [0-9] [0-9.[:space:]-]* \) )?
The parens suggest this is trying to parse an area code.  The rest limits it to any number of digits, spaces, dashes or dots, but the [0-9] ensures there is at least one digit.  This is likely the author's way of dealing with the multitude of phone number formats.
Let's give this a name, call it phone_chars, because it's what the author has decided phone numbers are made of.  There's another element, the [0-9.[:space:]-]* [0-9] [0-9.[:space:]-]* which I'll call a "phone atom" because it's what the author decided an atom of a phone number can be.  If we put that in its own regex and build the phone regex with it, things become a lot clearer.
my $phone_chars = qr{[0-9.[:space:]-]};
my $phone_atom  = qr{$phone_chars* [0-9] $phone_chars*}x;
/^
 \+ [[:space:]]* [0-9] $phone_chars*
 ( \( $phone_atom \) )?
 (    $phone_atom    )?
 ( [[:space:]]+ ext . $phone_atom )?
$/xi;
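To get a feel for what a "phone atom" accepts, you can anchor it and throw a few strings at it. This is only a minimal sketch; is_phone_atom is a helper name made up for illustration, not something from the original code.

```perl
use strict;
use warnings;

# Same building blocks as above.
my $phone_chars = qr{[0-9.[:space:]-]};
my $phone_atom  = qr{$phone_chars* [0-9] $phone_chars*}x;

# Hypothetical helper: true if the whole string is exactly one phone atom.
sub is_phone_atom {
    my ($text) = @_;
    return $text =~ /\A $phone_atom \z/x ? 1 : 0;
}

for my $sample ("555", "5", " 1-2.3 ", "...", "") {
    printf "%-12s %s\n", "'$sample'",
        is_phone_atom($sample) ? "atom" : "not an atom";
}
```

Anything made only of digits, whitespace, dots and dashes passes as long as it contains at least one digit, so "..." and the empty string are the only samples rejected here.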
If you know something about phone numbers, it's like this:
This regex doesn't do a very good job validating phone numbers. According to this regex, "+1" is a valid phone number, but "(555) 123-4567" isn't because it doesn't have a country code.
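You can check those two claims yourself. A rough sketch, assuming the refactored regex above is equivalent to the original one; $phone_re and looks_like_phone are names made up for this example.

```perl
use strict;
use warnings;

my $phone_chars = qr{[0-9.[:space:]-]};
my $phone_atom  = qr{$phone_chars* [0-9] $phone_chars*}x;

# The refactored regex, compiled once so it can be reused.
my $phone_re = qr{
    ^
    \+ [[:space:]]* [0-9] $phone_chars*
    ( \( $phone_atom \) )?
    (    $phone_atom    )?
    ( [[:space:]]+ ext . $phone_atom )?
    $
}xi;

# Hypothetical helper: true if the regex accepts the value.
sub looks_like_phone {
    my ($value) = @_;
    return $value =~ $phone_re ? 1 : 0;
}

for my $sample ("+1", "(555) 123-4567", "+1 (555) 123-4567", "+") {
    printf "%-20s %s\n", $sample,
        looks_like_phone($sample) ? "accepted" : "rejected";
}
```

"+1" and "+1 (555) 123-4567" are accepted, while "(555) 123-4567" and a lone "+" are rejected, because the pattern insists on a leading plus followed by at least one digit.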
Phone number validation is hard. Did I mention Number::Phone?
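use strict;
use warnings;
use v5.10;
use Number::Phone;
# new() returns undef when it can't make sense of the number,
# so check for that before calling methods on the result.
my $number = Number::Phone->new("+1(555)456-2398");
my $ok = defined $number && $number->is_valid;
say($ok ? "valid" : "invalid");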
Amazing what extended mode, a little whitespace and a few comments can do ...
if ($value !~  /
      ^                 # Anchor to start of string
     \+                 # followed (immediately) by literal '+'
     [[:space:]]*       # zero or more chars in the POSIX character class 'space'
     [0-9]              # compulsory digit
     [0-9.[:space:]-]*  # zero or more digit, full-stop, space or hyphen
     (                  # start capture to $1
         \(                   # Literal open parentheses
         [0-9.[:space:]-]*    # zero or more ... (as above)
         [0-9]                # compulsory digit
         [0-9.[:space:]-]*    # zero or more ... (as above)
         \)                   # Literal close parentheses
     )?                 # close capture to $1 - whole thing optional
     (                  # start capture to $2
         [0-9.[:space:]-]*    # zero or more ... (as above)
         [0-9]                # compulsory digit
         [0-9.[:space:]-]*    # zero or more ... (as above)
     )?                 # close capture to $2 - whole thing optional
     (                  # start capture to $3
         [[:space:]]+         # at least one space (as defined by POSIX)
         ext.                 # literal 'ext' followed by any character
         [0-9.[:space:]-]*    # zero or more ... (as above)
         [0-9]                # compulsory digit
         [0-9.[:space:]-]*    # zero or more ... (as above)
     )?                 # close capture to $3 - whole thing optional
      $                 # Anchor to end of string
              /ix       # close regex; ignore case, extended mode options
   )  {
    ...
}
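If you want to see what actually lands in the three capture groups, something like the following works. It is only a sketch: $phone_re and phone_parts are made-up names, and the sample numbers are invented for illustration.

```perl
use strict;
use warnings;

# The same pattern as above, compiled once. No dollar signs in the
# comments inside the pattern, since they would be interpolated.
my $phone_re = qr{
    ^
    \+ [[:space:]]* [0-9] [0-9.[:space:]-]*
    ( \( [0-9.[:space:]-]* [0-9] [0-9.[:space:]-]* \) )?                # group 1
    (    [0-9.[:space:]-]* [0-9] [0-9.[:space:]-]*    )?                # group 2
    ( [[:space:]]+ ext . [0-9.[:space:]-]* [0-9] [0-9.[:space:]-]* )?   # group 3
    $
}xi;

# Hypothetical helper: returns the three captures on a match, with ''
# for groups that did not take part, or an empty list on no match.
sub phone_parts {
    my ($value) = @_;
    return unless $value =~ $phone_re;
    return map { defined $_ ? $_ : '' } ($1, $2, $3);
}

for my $sample ("+1 (555) 123-4567 ext. 89", "+1 555 extX 12") {
    my @parts = phone_parts($sample);
    print "$sample => ", join(", ", map { "[$_]" } @parts), "\n";
}
```

For the first sample the groups come out as "(555)", " 123-4567" and " ext. 89". The second sample also matches, with " extX 12" in the third group, because the dot after ext accepts any character; the "555" there is swallowed by the uncaptured part at the start of the pattern, so the first two groups stay empty.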