I have a running python script that reads in a file of phone numbers. Some of these phone numbers are invalid.
import re
def IsValidNumber(number, pattern):
isMatch = re.search(pattern, number)
if isMatch is not None:
return number
numbers = [line.strip() for line in open('..\\phoneNumbers.txt', 'r')]
Then I use another list comprehension to filter out the bad numbers:
phonePattern = '^\d{10}$'
validPhoneNumbers = [IsValidNumber(x, phonePattern) for x in phoneNumbers
if IsValidNumber(x, phonePattern) is not None]
for x in validPhoneNumbers:
print x
Due to formatting, the second list comprehension spans two lines.
The problem is that although the IsValidNumber should only return the number if the match is valid, it also returns 'None' on invalid matches. So I had to modify the second list comprehension to include:
if IsValidNumber(x, phonePattern) is not None
While this works, the problem is that for each iteration in the list, the function is executed twice. Is there a cleaner approach to doing this?
Your isValidFunction
should return True/False (as its name suggests). That way your list comprehension becomes:
valid = [num for num in phoneNumbers if isValidNumber(num, pattern)]
While you're at it, modify numbers
to be a generator expression instead of a list comprehension (since you're interested in efficiency):
numbers = (line.strip() for line in open("..\\phoneNumbers.txt"))
Try this:
validPhoneNumbers = [x for x in phoneNumbers if isValidNumber(x, phonepattern)]
Since isValidNumber
returns the same number that's passed in, without modification, you don't actually need that number. You just need to know that a number is returned at all (meaning the number is valid).
You may be able to combine the whole thing as well, with:
validPhoneNumbers = [x.strip() for x in open('..\\phonenumbers.txt', 'r') if isValidNumber(x.strip(), phonePattern)]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With