I am extremely new to Python. I often get text files that have phone numbers in various formats. I am trying to create a Python script that takes such a text file and normalizes the numbers into a format I can use.
I am trying to remove all symbols and spaces and leave just the digits, then add +1 to the beginning and a comma (,) at the end.
import re
with open("test_numbers.txt") as file:
    dirty = file.read()
clean = re.sub(r'[^0-9]', '', dirty)
print(clean)
I'm trying to use regex but it puts everything on a single line. Maybe I am going about this all wrong. I have not worked out a way to add the +1 to the beginning of the number or add a comma at the end. Would appreciate any advice.
This might help you:
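import re
with open('test_numbers.txt') as f:
    # read the file as a list of lines, one phone number per line
    dirty = f.readlines()
clean = []
for l in dirty:
    # strip every non-digit (including the newline), then add +1 and a trailing comma
    clean.append('+1{},\n'.format(re.sub(r'[^0-9]', '', l)))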
clean will be a list of lines with +1 at the beginning and , at the end. You may then save it to a text file with:
with open('formatted_numbers.txt', 'w') as f:
    f.writelines(clean)
You can also do this in one line with a list comprehension:
clean = ['+1{},\n'.format(re.sub(r'[^0-9]', '', l)) for l in dirty]
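For what it's worth, the original attempt put everything on one line because file.read() returns the whole file as a single string and [^0-9] also matches the newline characters, so they get stripped along with the symbols. Processing line by line, as above, avoids that. Two edge cases the snippets above do not handle: a blank line becomes "+1," and a number that already starts with the country code becomes "+11...". Below is a minimal sketch that covers both, assuming US numbers with 10 national digits and one number per line. The helper names normalize_number and normalize_lines are made up for illustration.

```python
import re


def normalize_number(raw, country_code="1"):
    """Return '+1XXXXXXXXXX,' for one raw phone number, or None if it has no digits."""
    # Same pattern as above: drops everything that is not a digit, newline included.
    digits = re.sub(r"[^0-9]", "", raw)
    if not digits:
        return None  # blank line, or a line with no digits at all
    # Assumption: an 11-digit number starting with the country code already
    # includes it, so drop the prefix to avoid producing '+11...'.
    if len(digits) == 11 and digits.startswith(country_code):
        digits = digits[len(country_code):]
    return "+{}{},".format(country_code, digits)


def normalize_lines(lines):
    """Format every line that contains digits, one result per output line."""
    results = []
    for line in lines:
        formatted = normalize_number(line)
        if formatted is not None:
            results.append(formatted + "\n")
    return results


sample = ["(555) 123-4567\n", "555.987.6543\n", "\n", "1-555-222-3333\n"]
print("".join(normalize_lines(sample)), end="")
# +15551234567,
# +15559876543,
# +15552223333,
```

The list returned by normalize_lines can be passed to f.writelines() exactly like clean in the answer above.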