
Is there such a thing as a telephone number cleanser library?

I would like to find a tool/library that can take user-entered free format telephone numbers entered through a web site and parse them into a number suitable for calling from a given country.

This is not quite as simple as it may at first sound. The website allows people all around the world to enter their number in any way they choose, so many people enter national numbers (a country is also provided in another field). Some people enter the international number in "correct" formats (with the "+"), some people enter it slightly less correctly, using their country's international prefix.

I would like to tell the library/tool the country that I'm dialling from, the free-format telephone number, and optionally the country that the number belongs to (as this will help generate the international code if it was not entered), and have it use known patterns to best-guess the number that will work from my calling country.

So, for example, when calling from the US to a UK number:

+44 (0) 1225-344567 => 011441225344567

Or when calling from the UK to a US number:

(613) 4562342 => 0016134562342

Does anyone know of any (ideally .NET Framework-friendly) solutions that will save me from undoubtedly re-inventing the wheel?

Kram asked Jun 27 '11 13:06


1 Answer

I've implemented just this, Mark. I work for a wireless carrier and have an international SMS sending application. I am not aware of any third-party libraries that implement these rules. As mentioned above, one cannot just deal with random input, as phone number formats vary among countries. Some countries, such as Germany, have variable-length area codes and phone numbers. If users don't put the country code in there, you are sunk. However, in my case I can assume that a number missing a country code is a USA phone number. The results of my filter have proven to be very accurate with the users and input I've had.

One can make some assumptions, and by knowing the target audience, logging the inputs, and analyzing, one can get things dialed in. My first implementation was for a "white label" web app that is used for testing by various people around the world. I rapidly found that most foreign people have their act together and are used to the quasi-standard + format. It's usually Americans entering phone numbers that are most confused. Europeans are very used to international dialing.

First rule is to strip out everything but digits and a leading '+'.
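As a rough illustration of this first rule, here is a minimal Python sketch. The function name `strip_phone_input` is made up for this example and is not from my production code; it also assumes that a '+' counts as "leading" when it appears before any digit, so that input like "(+44) ..." is still recognized.

```python
import re


def strip_phone_input(raw: str) -> str:
    """Keep only ASCII digits, plus a single '+' if one precedes the first digit."""
    # Assumption: the first character that is a digit or '+' decides
    # whether the number was entered in the '+' format.
    first = re.search(r"[+0-9]", raw)
    has_plus = first is not None and first.group() == "+"
    digits = re.sub(r"[^0-9]", "", raw)
    return "+" + digits if has_plus else digits


print(strip_phone_input("+44 (0) 1225-344567"))  # +4401225344567
print(strip_phone_input("(613) 4562342"))        # 6134562342
```

Note that the "(0)" in the UK example survives this step. Dropping it is part of the per-country processing, not of the stripping.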

If the resulting number has fewer than 8 digits, it's junk; report an error to the user.

If the resulting number starts with a '+', assume it is in the standard format and that the next 2-6 digits represent the "country code". Figure out the country code, then process the remaining digits according to the rules for that country.
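One way to "figure out the country code" is a longest-prefix match against a table of known prefixes. The sketch below is only an assumption about how this could be done: the table `COUNTRY_PREFIXES` is a tiny hypothetical sample, and `split_country_code` is an illustrative name. It tries lengths from 6 down to 1 so that a more specific prefix such as 1671 (Guam) wins over the bare 1.

```python
from typing import Optional, Tuple

# Hypothetical sample table; a real one would cover every country.
COUNTRY_PREFIXES = {
    "1": "United States / NANP",
    "1671": "Guam",
    "33": "France",
    "44": "United Kingdom",
    "49": "Germany",
}


def split_country_code(digits: str, max_len: int = 6) -> Optional[Tuple[str, str, str]]:
    """Return (prefix, country name, remaining digits), or None if nothing matches.

    `digits` must already be stripped of the '+' and any punctuation.
    """
    for length in range(min(max_len, len(digits)), 0, -1):
        prefix = digits[:length]
        if prefix in COUNTRY_PREFIXES:
            return prefix, COUNTRY_PREFIXES[prefix], digits[length:]
    return None


print(split_country_code("441225344567"))  # ('44', 'United Kingdom', '1225344567')
print(split_country_code("16715551234"))   # ('1671', 'Guam', '5551234')
```

Trying the longest prefix first matters because several territories share a short code and are only told apart by the digits that follow it.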

If the number starts with 0, assume someone put in an international access prefix and strip off the leading 0s and 1s. Then, if the remaining number is 10 digits, assume it was a USA number entered by an American and handle it accordingly. If the remaining number is not 10 digits but is at least 8, assume the first 2-6 digits are a country code, look up the country code, and process according to that country's rules.

If the number starts with a 1, and it is 11 digits total, assume it is a USA (or Caribbean Island) number, and process accordingly.

If the number starts with a 1 and is not 11 digits total, strip the leading 1s, see if there are at least 8 digits left, and assume the remaining leading 2 to 6 digits are a country code and process per the country rules.

Finally, if the number does not start with +, 0, or 1 and has at least 8 digits, assume it is in the standard notation without the +, that is, country code first. Use the first 2 to 6 digits as the country code and process according to that country's rules.
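Put together, the rules above form a cascade of checks. The following Python sketch is one possible reading of them, not my actual C# code. The function name `classify` and the rule labels are invented for the example, and it expects input that has already been reduced to digits with an optional leading '+'. It returns a label for the rule that fired and the digits, starting with the country code.

```python
from typing import Tuple


def classify(cleaned: str) -> Tuple[str, str]:
    """Return (rule label, digits beginning with the country code).

    Raises ValueError when fewer than 8 digits are available.
    """
    digits = cleaned.lstrip("+")
    if len(digits) < 8:
        raise ValueError("too few digits to be a phone number")
    if cleaned.startswith("+"):
        return "plus-format", digits
    if digits.startswith("0"):
        # Assume an international access prefix: drop leading 0s and 1s.
        rest = digits.lstrip("01")
        if len(rest) == 10:
            return "usa", "1" + rest
        if len(rest) >= 8:
            return "access-prefix-stripped", rest
        raise ValueError("too few digits after the access prefix")
    if digits.startswith("1"):
        if len(digits) == 11:
            return "usa", digits
        rest = digits.lstrip("1")
        if len(rest) >= 8:
            return "leading-1-stripped", rest
        raise ValueError("too few digits after the leading 1s")
    # No '+', 0, or 1 in front: assume country code first.
    return "bare-country-code", digits


print(classify("011441225344567"))  # ('access-prefix-stripped', '441225344567')
print(classify("0016134562342"))    # ('usa', '16134562342')
```

In every non-USA branch, the returned digits still need the country-code lookup and the per-country rules applied to them.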

The trick in all of this is to have a mapping of all the world's country codes, and the numbering plan information for each country. I have that map, and the rules for many countries. If you would like this information, I'd be happy to share it, along with some C# code that figures out which country a number belongs to. Message me.
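To give an idea of what such a map might hold, here is a small Python sketch with just two entries. The `NumberingPlan` record, the `PLANS` table, and `dial_string` are all hypothetical names for this illustration, and real numbering plans need far more detail (number lengths, area code rules, and so on). It only covers the international case from the question: drop the destination's trunk prefix if present, then prepend the caller's international access prefix and the destination country code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumberingPlan:
    name: str
    country_code: str
    exit_prefix: str   # dialed to start an international call from this country
    trunk_prefix: str  # dialed before national numbers inside this country


# Hypothetical two-entry sample of the world map.
PLANS = {
    "US": NumberingPlan("United States", "1", "011", "1"),
    "GB": NumberingPlan("United Kingdom", "44", "00", "0"),
}


def dial_string(calling: str, target: str, national_digits: str) -> str:
    """Build the digits to dial from `calling` to a national number in `target`."""
    origin, dest = PLANS[calling], PLANS[target]
    # Drop the destination's trunk prefix, e.g. the UK's leading 0.
    if dest.trunk_prefix and national_digits.startswith(dest.trunk_prefix):
        national_digits = national_digits[len(dest.trunk_prefix):]
    return origin.exit_prefix + dest.country_code + national_digits


print(dial_string("US", "GB", "01225344567"))  # 011441225344567
print(dial_string("GB", "US", "6134562342"))   # 0016134562342
```

The two calls reproduce the examples from the question, calling the UK from the US and the US from the UK.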

A huge help in this is to post back the name of the country your software is guessing to the user. They will understand rapidly if they are trying to enter a German phone number and your software asks them if they are trying to dial Guam!

Christo answered Nov 09 '22 23:11