I am very new to regex. Using Python's re module, I am looking to extract phone numbers from the multi-line string below:
Source = """<p><strong>Kuala Lumpur</strong><strong>:</strong> +60 (0)3 2723 7900</p>
<p><strong>Mutiara Damansara:</strong> +60 (0)3 2723 7900</p>
<p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
<h2>Where we are </h2>
<strong> Call us on:</strong> +6 (03) 8924 8686
</p></div><div class="sys_two">
<h3 class="parentSchool">General enquiries</h3><p style="FONT-SIZE: 11px">
<strong> Call us on:</strong> +6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
Fax:<br />
+60 (7) 228-6202<br />
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""
So once I have the right pattern, I should be able to find the numbers using
phone = re.findall(pattern, Source, re.DOTALL)
and get:
['+60 (0)3 2723 7900',
'+60 (0)3 2723 7900',
'+ 60 (0)4 255 9000',
'+6 (03) 8924 8686',
'+6 (03) 8924 8000',
'+ 60 (7) 268-6200',
'+60 (7) 228-6202',
'+601-4228-8055']
Please help me identify the right pattern.
Here, we will see a Python program to extract a phone number from a string using the sub() method. A regular expression in Python is a search pattern formed by a sequence of characters. The sub() method replaces all occurrences of a pattern in the string with a substring or character.
If you want to extract phone numbers with regular expressions but don't know how to write one, this article may help. A single large string can contain multiple phone numbers, and they can come in a variety of formats. Here is an example of the formats: (021)1234567. (123) 456 7899.
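As a minimal sketch for the two formats quoted above, assume a three-digit area code in parentheses followed by seven digits, with optional single spaces between the groups. The pattern and the variable names are illustrative and not taken from any particular article.

```python
import re

# Illustrative text containing the two formats quoted above.
text = "Here is an example of the file format: (021)1234567. (123) 456 7899."

# Assumed shape: "(ddd)", optional space, "ddd", optional space, "dddd".
area_code_pattern = re.compile(r'\(\d{3}\)\s?\d{3}\s?\d{4}')

numbers = area_code_pattern.findall(text)
print(numbers)
```

Real-world data usually needs a looser pattern than this, since country codes, hyphens and varying group lengths are all common.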
A common option for searching and validating data such as phone numbers, zip codes and identifiers is a regular expression (regex). Next, we'll see examples that find, extract or validate phone numbers in a given text or string. The article starts with easy examples and finishes with advanced ones.
How can you use Python's re module to check whether a string represents a phone number? To check whether a string matches a specific pattern, use the module's match or fullmatch functions. Before writing your regex pattern, inspect the variants of the phone number field to see whether your pattern will match them.
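In Python's re module, whole-string validation is usually done with re.fullmatch (or a compiled pattern's fullmatch method), which succeeds only if the entire string fits the pattern. The sketch below is one possible check; PHONE_RE and looks_like_phone are hypothetical names, and the pattern is an assumption about what counts as a phone number, not a standard.

```python
import re

# Hypothetical validation pattern: optional "+", optional space, optional "(",
# a non-zero digit, then 8 or more digits/spaces/dots/hyphens/parentheses,
# ending in a digit.
PHONE_RE = re.compile(r'\+?\s?\(?[1-9][0-9 .\-()]{8,}[0-9]')

def looks_like_phone(candidate):
    """Return True if the whole (stripped) string fits the assumed shape."""
    return PHONE_RE.fullmatch(candidate.strip()) is not None

print(looks_like_phone('+60 (0)3 2723 7900'))
print(looks_like_phone('Call us on:'))
```

Unlike re.match, which only anchors at the start of the string, fullmatch also rejects trailing text after the number.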
This should find all the phone numbers in a given string:
re.findall(r'\+?\(?[1-9][0-9 .\-\(\)]{8,}[0-9]', Source)
>>> re.findall(r'[\+\(]?[1-9][0-9 .\-\(\)]{8,}[0-9]', Source)
['+60 (0)3 2723 7900', '+60 (0)3 2723 7900', '60 (0)4 255 9000', '+6 (03) 8924 8686', '+6 (03) 8924 8000', '60 (7) 268-6200', '+60 (7) 228-6202', '+601-4228-8055']
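Basically, the regex lays out these rules:
1. The matched string may start with a + or ( symbol.
2. That must be followed by a digit from 1 to 9.
3. The middle must contain at least 8 characters, each a digit, a space or one of . - ( ).
4. It must end with a digit from 0 to 9.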
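The pattern above requires the first digit to follow the + immediately, which is why '+ 60 (0)4 255 9000' and '+ 60 (7) 268-6200' come back without their plus sign. A possible variant, offered as a sketch rather than as part of the original answer, allows one optional space after the +. The shortened sample text and the names sample, pattern and phones are only for illustration.

```python
import re

# A shortened copy of the question's text, enough to show the difference.
sample = """<p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
<strong> Call us on:</strong> +6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""

# Same rules as above, plus an optional single space after the leading "+".
pattern = re.compile(r'(?:\+ ?)?\(?[1-9][0-9 .\-()]{8,}[0-9]')

phones = pattern.findall(sample)
for phone in phones:
    print(phone)
```

The group is non-capturing, so findall still returns the whole match rather than just the prefix.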
Using the re module:
>>> import re
>>> Source = """<p><strong>Kuala Lumpur</strong><strong>:</strong> +60 (0)3 2723 7900</p>
<p><strong>Mutiara Damansara:</strong> +60 (0)3 2723 7900</p>
<p><strong>Penang:</strong> + 60 (0)4 255 9000</p>
<h2>Where we are </h2>
<strong> Call us on:</strong> +6 (03) 8924 8686
</p></div><div class="sys_two">
<h3 class="parentSchool">General enquiries</h3><p style="FONT-SIZE: 11px">
<strong> Call us on:</strong> +6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
Fax:<br />
+60 (7) 228-6202<br />
Phone:</strong><strong style="color: #f00">+601-4228-8055</strong>"""
>>> for i in re.findall(r'\+[-()\s\d]+?(?=\s*[+<])', Source):
...     print(i)
...
+60 (0)3 2723 7900
+60 (0)3 2723 7900
+ 60 (0)4 255 9000
+6 (03) 8924 8686
+6 (03) 8924 8000
+ 60 (7) 268-6200
+60 (7) 228-6202
+601-4228-8055
>>>
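To make the pattern from this answer easier to read, it can be written with re.VERBOSE and a comment per piece. This is only a restatement of the same regex; the names PHONE_RE, sample and found, and the shortened sample text, are illustrative.

```python
import re

# The pattern from the answer above, spelled out with re.VERBOSE comments.
PHONE_RE = re.compile(r'''
    \+              # a literal plus sign starts every number
    [-()\s\d]+?     # hyphens, parentheses, whitespace and digits, matched lazily
    (?=\s*[+<])     # stop where optional whitespace is followed by + or <
''', re.VERBOSE)

# A shortened copy of the question's text.
sample = """<strong> Call us on:</strong> +6 (03) 8924 8000
+ 60 (7) 268-6200 <br />
Fax:<br />
+60 (7) 228-6202<br />"""

found = PHONE_RE.findall(sample)
for number in found:
    print(number)
```

Note that the lookahead ties the pattern to this particular markup: a number is only matched when a tag or another + comes after it, so a number at the very end of the text would be missed.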