
Phone Number formatting using Regex [duplicate]

Tags: c#, regex

Possible Duplicate:
A comprehensive regex for phone number validation

I have an unformatted phone number (guaranteed to be 10 digits) and an unformatted extension (which could be null, blank, or any number of digits). I need to put them together into a "friendly" string. I thought I'd concatenate them, then format the concatenation using Regex.Replace. Here's the unit test I'm using to try various regexes before I plug one in:

    [Test, Ignore("Sandbox, does not test production code")]
    public void TestPhoneRegex()
    {
        string number = "1234567890";
        string extension = "";

        var formattedContactNumber =
            Regex.Replace("{0} x{1}".FormatWith(number, extension),
                          @"^(\d{3})[ -]?(\d{3})[ -]?(\d{4})( x\d+)?",
                          @"$1-$2-$3$4");

        Debug.WriteLine("{0} x{1}".FormatWith(number, extension));
        Debug.WriteLine(formattedContactNumber);

        Assert.AreEqual("123-456-7890", formattedContactNumber);
    }

The expected formatted string is the formatted phone number, without the "x" and extension. However, the last capture group is matching the "x" with or without a number behind it, so instead of "123-456-7890" I get "123-456-7890 x". This is the last bit of development that needs to be tied down before a release. Help?

KeithS asked Feb 11 '11 21:02


3 Answers

I love regular expressions, don't get me wrong, but this does not seem like a useful area to apply them. All you are doing is adding dashes to a string of 10 numbers then adding an optional "x" followed by an extension. Simpler is better.

public static String beautifyPhoneNumber(String number, String extension)
{
    String beautifulNumber = number.Substring(0, 3) + "-" +
                             number.Substring(3, 3) + "-" +
                             number.Substring(6, 4);
    if (!String.IsNullOrEmpty(extension))
    {
        beautifulNumber += " x" + extension;
    }
    return beautifulNumber;
}
John McDonald answered Oct 31 '22 18:10


The " x" isn't matched by your regex when no digits follow it, so it isn't replaced and stays in the string. Try this regex instead:

@"^(\d{3})[ -]?(\d{3})[ -]?(\d{4}) x(\d*)"

In the new regex x isn't optional: according to your code it will always be there (if you do want it to be optional, you can use ?x?(\d*)). Also, we're using \d*, which makes sure the last group always matches, even when it's empty.
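One thing to watch: with the " x" moved outside the capture group, the question's replacement string "$1-$2-$3$4" would append the extension digits directly after the number, with no separator. A minimal sketch of one way around that, assuming a MatchEvaluator is acceptable here, is below. The class and method names (PhoneFormatSketch, Format) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneFormatSketch
{
    // Pattern from the answer: " x" is required, the extension digits are optional.
    private static readonly Regex Pattern =
        new Regex(@"^(\d{3})[ -]?(\d{3})[ -]?(\d{4}) x(\d*)");

    // Hypothetical helper: expects the concatenated "number xEXT" string
    // that the question builds before calling Regex.Replace.
    public static string Format(string numberAndExtension)
    {
        return Pattern.Replace(numberAndExtension, match =>
        {
            string formatted = match.Groups[1].Value + "-" +
                               match.Groups[2].Value + "-" +
                               match.Groups[3].Value;
            string extension = match.Groups[4].Value;

            // Only put " x" back when there are extension digits to show.
            return extension.Length > 0
                ? formatted + " x" + extension
                : formatted;
        });
    }
}
```

Called as PhoneFormatSketch.Format(string.Format("{0} x{1}", number, extension)), this should give "123-456-7890" for an empty or null extension and "123-456-7890 x42" for extension "42".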

Kobi answered Oct 31 '22 17:10


This is maybe not a direct answer to your question, but possibly helpful... We use this pattern:

public const string NorthAmericanPhonePattern = @"^(\+?(?<NatCode>1)\s*[-\/\.]?)?(\((?<AreaCode>\d{3})\)|(?<AreaCode>\d{3}))\s*[-\/\.]?\s*(?<Number1>\d{3})\s*[-\/\.]?\s*(?<Number2>\d{4})\s*(([xX]|[eE][xX][tT])\.?\s*(?<Ext>\d+))*$";

And then reformat with:

private static string PhoneNumberMatchEvaluator(Match match)
{
    // Format to north american style phone numbers "0 (000) 000-0000"
    //                                          OR  "(000) 000-0000"
    Debug.Assert(match.Success);
    if (match.Groups["NatCode"].Success)
    {
        return match.Result("${NatCode} (${AreaCode}) ${Number1}-${Number2}");
    }
    else
    {
        return match.Result("(${AreaCode}) ${Number1}-${Number2}");
    }
}

private static string FormatPhoneNumber(string phoneNumber)
{
    var regex = new Regex(NorthAmericanPhonePattern, RegexOptions.IgnoreCase);
    return regex.Replace(phoneNumber, new MatchEvaluator(PhoneNumberMatchEvaluator));
}

Note: In our case we include the national code in the output only if it was present in the input; you could easily take that out. We also leave the extension out of the formatted result, because when we find one we move it into a separate field.
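As a rough sketch of that last point, the Ext group can be read from the same match and returned separately. The pattern is copied from above; the class and method names (PhoneSplitSketch, TrySplit) and the out-parameter shape are assumptions for illustration, not the code we actually run.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhoneSplitSketch
{
    // Same pattern as NorthAmericanPhonePattern above.
    private const string Pattern = @"^(\+?(?<NatCode>1)\s*[-\/\.]?)?(\((?<AreaCode>\d{3})\)|(?<AreaCode>\d{3}))\s*[-\/\.]?\s*(?<Number1>\d{3})\s*[-\/\.]?\s*(?<Number2>\d{4})\s*(([xX]|[eE][xX][tT])\.?\s*(?<Ext>\d+))*$";

    // Hypothetical helper: formats the number and hands back the
    // extension (empty if none was found) as a separate value.
    public static bool TrySplit(string input, out string formatted, out string extension)
    {
        Match match = Regex.Match(input, Pattern, RegexOptions.IgnoreCase);
        if (!match.Success)
        {
            formatted = string.Empty;
            extension = string.Empty;
            return false;
        }

        formatted = match.Groups["NatCode"].Success
            ? match.Result("${NatCode} (${AreaCode}) ${Number1}-${Number2}")
            : match.Result("(${AreaCode}) ${Number1}-${Number2}");

        // Value is an empty string when the Ext group did not participate.
        extension = match.Groups["Ext"].Value;
        return true;
    }
}
```

For the input "416-555-0123 ext. 42" this should produce "(416) 555-0123" with the extension "42" returned on its own.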

Reddog answered Oct 31 '22 17:10