 

Python regex to extract phone numbers from string

I am very new to regex. Using Python's re module, I want to extract the phone numbers from the following multi-line string:

 Source = """<p><strong>Kuala Lumpur</strong><strong>:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Mutiara Damansara:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
        <h2>Where we are </h2>
        <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8686
        </p></div><div class="sys_two">
    <h3 class="parentSchool">General enquiries</h3><p style="FONT-SIZE: 11px">
     <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
 Fax:<br /> 
 +60 (7) 228-6202<br /> 
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""

So with the right pattern, calling

phone = re.findall(pattern, Source, re.DOTALL)

should return:

 ['+60 (0)3 2723 7900',
  '+60 (0)3 2723 7900',
  '+ 60 (0)4 255 9000',
  '+6 (03) 8924 8686',
  '+6 (03) 8924 8000',
  '+ 60 (7) 268-6200',
  '+60 (7) 228-6202',
  '+601-4228-8055']

Please help me identify the right pattern.

Shekhar Samanta asked May 23 '16 14:05





2 Answers

This should find all the phone numbers in a given string:

re.findall(r'\+?\(?[1-9][0-9 .\-\(\)]{8,}[0-9]', Source)

 >>> re.findall(r'[\+\(]?[1-9][0-9 .\-\(\)]{8,}[0-9]', Source)
 ['+60 (0)3 2723 7900', '+60 (0)3 2723 7900', '60 (0)4 255 9000', '+6 (03) 8924 8686', '+6 (03) 8924 8000', '60 (7) 268-6200', '+60 (7) 228-6202', '+601-4228-8055']

Basically, the regex lays out these rules:

  1. The matched string may start with a + or ( symbol.
  2. That must be followed by a digit from 1 to 9.
  3. It has to end with a digit from 0 to 9.
  4. In between it needs at least 8 characters, each of which may be a digit, a space, or one of . - ( )
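
As a rough sketch of the rules above, here is the same pattern written with re.VERBOSE so each rule sits on its own commented line. SAMPLE is a shortened, hypothetical excerpt built from the question's input, and VARIANT_RE is my own assumption about how to keep the leading "+" when a space follows it (the pattern as posted drops it for "+ 60 ..." numbers, as the output above shows).

```python
import re

# Hypothetical sample text; the numbers are copied from the question.
SAMPLE = """<p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
<strong>Call us on:</strong>&nbsp;+6 (03) 8924 8686
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""

# The four rules, one per line. In verbose mode, whitespace inside a
# character class is still significant, so the space in rule 4 is kept.
PHONE_RE = re.compile(r"""
    [+(]?             # rule 1: optional leading plus sign or open parenthesis
    [1-9]             # rule 2: first digit is 1-9
    [0-9 .\-()]{8,}   # rule 4: 8 or more digits, spaces, dots, hyphens, parentheses
    [0-9]             # rule 3: last character is a digit
""", re.VERBOSE)

# Assumed variant: also accept one space after the plus sign, so that
# "+ 60 ..." keeps its prefix in the match.
VARIANT_RE = re.compile(r"(?:\+ ?|\()?[1-9][0-9 .\-()]{8,}[0-9]")

print(PHONE_RE.findall(SAMPLE))
print(VARIANT_RE.findall(SAMPLE))
```

On this sample the first call returns the Penang number without its "+ " prefix, while the variant keeps it. Whether that prefix matters depends on what you do with the numbers afterwards.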
Sharmila answered Sep 25 '22 00:09



Using the re module:

>>> import re
>>> Source = """<p><strong>Kuala Lumpur</strong><strong>:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Mutiara Damansara:</strong> +60 (0)3 2723 7900</p>
        <p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
        <h2>Where we are </h2>
        <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8686
        </p></div><div class="sys_two">
    <h3 class="parentSchool">General enquiries</h3><p style="FONT-SIZE: 11px">
     <strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
 Fax:<br /> 
 +60 (7) 228-6202<br /> 
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""

>>> for i in re.findall(r'\+[-()\s\d]+?(?=\s*[+<])', Source):
...     print(i)


+60 (0)3 2723 7900
+60 (0)3 2723 7900
+ 60 (0)4 255 9000
+6 (03) 8924 8686
+6 (03) 8924 8000
+ 60 (7) 268-6200
+60 (7) 228-6202
+601-4228-8055
>>> 
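
To spell out how the session above works, here is a small sketch that splits the same pattern into commented pieces. SAMPLE is a hypothetical two-line excerpt of the question's input, and normalize() is a helper I made up for illustration; it is not part of the original answer.

```python
import re

# Two lines from the question's input, used here as hypothetical test data.
SAMPLE = """<strong>&nbsp;Call us on:</strong>&nbsp;+6 (03) 8924 8000
+ 60 (7) 268-6200 <br />"""

PATTERN = re.compile(
    r"\+"            # every number starts with a literal plus sign
    r"[-()\s\d]+?"   # lazily take hyphens, parentheses, whitespace and digits
    r"(?=\s*[+<])"   # stop where optional whitespace is followed by "+" or "<"
)


def normalize(number):
    """Hypothetical helper: keep only the digits and put one '+' in front."""
    return "+" + re.sub(r"\D", "", number)


matches = PATTERN.findall(SAMPLE)
print(matches)
print([normalize(m) for m in matches])
```

One thing to keep in mind: because of the lookahead, a number is only matched if a "+" or "<" eventually follows it, so a number sitting at the very end of the input with nothing after it would be skipped. That is fine for HTML like the question's, but probably not for plain text.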
Avinash Raj answered Sep 23 '22 00:09
