
Remove unwanted characters from phone number string

Tags: python, regex

I am looking for a regex that grabs a phone number from a string and removes the unneeded characters.

import re
strs = 'dsds +48 124 cat cat cat245 81243!!'
match = re.search(r'.[ 0-9\+\-\.\_]+', strs)

if match:
    print('found', match.group())
else:
    print('did not find')

It returns only:

+48 124 

How can I return the entire number?

Efrin asked Jun 20 '12 11:06



2 Answers

You want to use sub(), not search():

>>> import re
>>> strs = 'dsds +48 124 cat cat cat245 81243!!'
>>> re.sub(r"[^0-9+._ -]+", "", strs)
' +48 124   245 81243'

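[^0-9+._ -] is a negated character class. The ^ is significant here - this expression means: "Match a character that is not a digit, a plus, a dot, an underscore, a space or a dash". The dash is placed last so that it is read as a literal dash rather than as a range.

The + after the closing bracket tells the regex engine to match one or more instances of the preceding token, so each run of unwanted characters is removed in one go.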
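To make the difference concrete, here is a minimal sketch that contrasts search() with sub() on the same input. The final whitespace cleanup step is an assumption about what a tidy result should look like; it is not part of the answer above.

```python
import re

strs = 'dsds +48 124 cat cat cat245 81243!!'

# search() returns only the first contiguous run of allowed characters,
# so it stops as soon as it reaches the first "cat".
first_run = re.search(r"[0-9+._ -]+", strs).group()

# sub() deletes every run of disallowed characters, wherever it occurs.
kept = re.sub(r"[^0-9+._ -]+", "", strs)

# Assumed extra step: collapse runs of whitespace and trim the ends.
tidy = re.sub(r"\s+", " ", kept).strip()

print(repr(first_run))  # ' +48 124 '
print(repr(kept))       # ' +48 124   245 81243'
print(repr(tidy))       # '+48 124 245 81243'
```

If the spaces are not wanted at all, leaving the space out of the character class removes them in the same pass.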

Tim Pietzcker answered Oct 31 '22 19:10


The problem with that re.sub() call is that it leaves extra spaces in the final phone number string. A way to do it without regular expressions, which returns the phone number without any spaces:

>>> strs = 'dsds +48 124 cat cat cat245 81243!!'
>>> ''.join(x for x in strs if x.isdigit() or x == '+')
'+4812424581243'
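For comparison, the same filtering can probably be expressed with a regex by keeping only digits and the plus sign in the character class. This is a sketch; normalize_phone is a hypothetical helper name used here for illustration only.

```python
import re

def normalize_phone(text):
    """Hypothetical helper: keep only digits and '+' from text."""
    # Every character that is not a digit or a plus sign is removed.
    return re.sub(r"[^0-9+]", "", text)

print(normalize_phone('dsds +48 124 cat cat cat245 81243!!'))  # +4812424581243
```

Both versions keep a '+' wherever it appears in the input, not just at the start of the number, so a stray plus sign elsewhere in the string would end up in the result.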
Burhan Khalid answered Oct 31 '22 19:10