I have a pandas dataframe containing (besides other columns) full names:
fullname
martin master
andreas test
I want to create a new column which splits the fullname column along the blank space and assigns the last element to a new column. The result should look like:
fullname lastname
martin master master
andreas test test
I thought it would work like this:
df['lastname'] = df['fullname'].str.split(' ')[-1]
However, I get a KeyError: -1
I use [-1], that is, the last element of the split result, to be sure that I get the real last name. For a name with more than two parts (e.g. andreas martin master), this still returns the last name, that is, master.
So how can I do this?
pandas provides the Series.str.split() method to split strings around a given separator or delimiter. The result can be stored as a list in each element of a Series, or it can be expanded into multiple DataFrame columns.
We can use str.split() to split one column into multiple columns by specifying the expand=True option. We can use str.extract() to extract multiple columns with a regular expression that defines multiple capturing groups.
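As a minimal sketch of the two options just described, the following uses the sample names from the question. The column labels "first" and "last" and the regular expression are illustrative assumptions, and the pattern only matches names with exactly two parts.

```python
import pandas as pd

df = pd.DataFrame({"fullname": ["martin master", "andreas test"]})

# expand=True returns a DataFrame with one column per split piece
# (splitting on whitespace by default).
parts = df["fullname"].str.split(expand=True)
parts.columns = ["first", "last"]  # hypothetical column labels

# str.extract returns one column per capturing group in the pattern.
# This pattern assumes exactly two whitespace-separated parts.
extracted = df["fullname"].str.extract(r"^(\S+)\s+(\S+)$")
extracted.columns = ["first", "last"]  # hypothetical column labels
```

Both approaches assume a fixed number of name parts, so they are less convenient than taking the last element when names can have a middle part.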
pandas iloc retrieves data by integer position. In Python, negative indices count from the end, so index -1 accesses the last element and gives the same result as index length-1.
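A short sketch of positional indexing with iloc, using the sample names from the question. Note that iloc[-1] selects the last row of the Series, not the last word within each row, so it does not solve the question on its own.

```python
import pandas as pd

s = pd.Series(["martin master", "andreas test"])

# Positional access: -1 counts from the end of the Series.
last_row = s.iloc[-1]
same_row = s.iloc[len(s) - 1]
```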
We can use the pandas Series.str.split() function to break up strings into multiple columns around a given separator or delimiter. It is similar to Python's string split() method, but it is applied to every element of the Series, that is, to an entire DataFrame column.
You need another .str accessor to get the last element of the split result for every row. What you did was essentially try to index the Series using a non-existent label:
In [31]:
df['lastname'] = df['fullname'].str.split().str[-1]
df
Out[31]:
fullname lastname
0 martin master master
1 andreas test test
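To make the difference concrete, here is a small sketch that reproduces the KeyError and then applies the fix. It assumes the default integer index (0, 1, ...), where -1 is treated as a label; the three-part name and the lastname_rsplit column are added for illustration only. The rsplit(n=1) variant is an alternative that splits once from the right instead of splitting the whole string.

```python
import pandas as pd

df = pd.DataFrame({"fullname": ["martin master", "andreas martin master"]})

# Each element of this Series is a list of name parts.
split_names = df["fullname"].str.split()

# Plain [-1] is a label lookup on the Series index, and with the
# default integer index there is no label -1.
try:
    split_names[-1]
    raised = False
except KeyError:
    raised = True

# .str[-1] indexes into each row's list instead.
df["lastname"] = split_names.str[-1]

# Alternative: split only once, starting from the right.
df["lastname_rsplit"] = df["fullname"].str.rsplit(n=1).str[-1]
```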