I'm trying to extract a single last name from each full name in a list.
names = ["John Smith", "D.J. Richies III","AJ Hardie Jr.", "Shelia Jackson-Lee", "Bob O'Donnell"]
Desired output:
last_names = ['Smith', 'Richies','Hardie','Lee', 'ODonnell' ]
I'm hoping there is an existing library or piece of code that can easily handle some of these rarer, odd cases.
Thanks for your help!
Naive string-manipulation solutions will eventually fail. You start to realize this with suffixes (III, Jr.), but what about compound last names like de la Paz?
You want the Python Human Name Parser (the nameparser package):
>>> from nameparser import HumanName
>>> name = HumanName("Dr. Juan Q. Xavier de la Vega III")
>>> name.title
'Dr.'
>>> name["title"]
'Dr.'
>>> name.first
'Juan'
>>> name.middle
'Q. Xavier'
>>> name.last
'de la Vega'
>>> name.suffix
'III'
You can try this:
names = ["John Smith", "D.J. Richies III", "AJ Hardie Jr.", "Shelia Jackson-Lee", "Bob O'Donnell"]
suffixes = ["II", "Jr.", "III", "Sr."]
last_names = []
for name in names:
    parts = name.split()
    # Drop a trailing generational suffix such as "Jr." or "III".
    if len(parts) > 1 and parts[-1] in suffixes:
        parts = parts[:-1]
    # Take the final word; for hyphenated names keep only the part after the last hyphen.
    last_names.append(parts[-1].split("-")[-1])
print(last_names)
The output contains the last names. Note that the apostrophe in O'Donnell is kept, unlike in the desired output from the question:
['Smith', 'Richies', 'Hardie', 'Lee', "O'Donnell"]
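If you need exactly the output from the question (ODonnell without the apostrophe), the same idea can be wrapped in a small function. This is only a minimal sketch using the standard library: the names last_name and SUFFIXES are made up for illustration, the suffix list is an assumption and far from exhaustive, and it will still mangle compound surnames such as "de la Vega".

```python
# Illustrative suffix list (an assumption, not exhaustive), compared
# case-insensitively and without trailing periods or commas.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def last_name(full_name: str) -> str:
    """Return a single-token last name using simple heuristics."""
    parts = full_name.split()
    if not parts:
        return ""
    # Drop trailing generational suffixes such as "Jr." or "III".
    while len(parts) > 1 and parts[-1].rstrip(".,").lower() in SUFFIXES:
        parts.pop()
    # Keep only the segment after the last hyphen ("Jackson-Lee" -> "Lee").
    surname = parts[-1].split("-")[-1]
    # Remove apostrophes so "O'Donnell" becomes "ODonnell".
    return surname.replace("'", "")


names = ["John Smith", "D.J. Richies III", "AJ Hardie Jr.", "Shelia Jackson-Lee", "Bob O'Donnell"]
last_names = [last_name(n) for n in names]
print(last_names)
```

For the list in the question this prints ['Smith', 'Richies', 'Hardie', 'Lee', 'ODonnell']. For anything beyond simple Western-style names, the parser library from the other answer is the safer choice.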