I have a list of email addresses with the following format:
name###@email.com
But the number is not always present. For example: [email protected], [email protected], [email protected], etc. I want to sort these names by the number, with those without a number coming first. I have come up with something that works, but being new to Python, I'm curious as to whether there's a better way of doing it. Here is my solution:
import re

def sortKey(name):
    m = re.search(r'(\d+)@', name)
    return int(m.expand(r'\1')) if m is not None else 0

names = [ ... a list of emails ... ]
for name in sorted(names, key = sortKey):
    print name
This is the only time in my script that I am ever using "sortKey", so I would prefer it to be a lambda function, but I'm not sure how to do that. I know this will work:
for name in sorted(names, key = lambda n: int(re.search(r'(\d+)@', n).expand(r'\1')) if re.search(r'(\d+)@', n) is not None else 0):
    print name
But I don't think I should need to call re.search twice to do this. What is the most elegant way of doing this in Python?
It is better to use re.findall: if no numbers are found, it returns an empty list, which sorts before any populated list. The sort key is the list of numbers found (converted to ints), followed by the string itself, so addresses with the same number are ordered alphabetically:
emails = '[email protected] [email protected] [email protected]'.split()
import re
print sorted(emails, key=lambda L: (map(int, re.findall('(\d+)@', L)), L))
# ['[email protected]', '[email protected]', '[email protected]']
Using john1 instead, the output is: ['[email protected]', '[email protected]', '[email protected]']. This shows that although john comes lexicographically after joe, the number is taken into account first, which moves john ahead.
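Note that the snippet above relies on Python 2, where map returns a list. In Python 3, map returns an iterator, and comparing two map objects raises a TypeError, so the key needs a real list. Here is a minimal sketch of the same idea for Python 3; the addresses are made up for illustration and are not the ones from the question:

```python
import re

# Hypothetical addresses, invented for this example only.
emails = ["john12@example.com", "joe@example.com", "jane3@example.com"]

# Key: (list of ints found before the "@", the address itself).
# An empty list sorts before any non-empty list, so addresses
# without a number come first.
result = sorted(
    emails,
    key=lambda s: ([int(n) for n in re.findall(r"(\d+)@", s)], s),
)

print(result)
# ['joe@example.com', 'jane3@example.com', 'john12@example.com']
```

The list comprehension does the job of map(int, ...) while producing a list that can be compared inside the key tuple.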
There is a somewhat hackish way if you wanted to keep your existing method of using re.search in a one-liner (but yuck):
getattr(re.search('(\d+)@', s), 'groups', lambda: ('0',))()
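Be aware that groups() returns strings, so the getattr trick compares '10' and '9' as text rather than as numbers unless the result is converted. If Python 3.8 or later is available (an assumption, since the question uses Python 2), an assignment expression is one possible way to keep re.search in a lambda and call it only once. A sketch with made-up addresses:

```python
import re

# Hypothetical addresses, invented for this example only.
names = ["john12@example.com", "joe@example.com", "jane3@example.com"]

# The condition is evaluated first, so m is bound to the match
# (or None) before int(m.group(1)) is ever reached.
sorted_names = sorted(
    names,
    key=lambda n: int(m.group(1)) if (m := re.search(r"(\d+)@", n)) else 0,
)

for name in sorted_names:
    print(name)
```

As in the original sortKey, addresses without a number get the key 0 and therefore come first.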