
Split pandas column and add last element to a new column

I have a pandas dataframe containing (besides other columns) full names:

 fullname
 martin master
 andreas test

I want to split the fullname column on the blank space and assign the last element of each split to a new column. The result should look like:

 fullname           lastname
 martin master      master
 andreas test       test

I thought it would work like this:

df['lastname'] = df['fullname'].str.split(' ')[-1]

However, I get a KeyError: -1

I use [-1], i.e. the last element of the split result, to be sure that I get the actual last name. For a name with more than two parts (e.g. andreas martin master), this should still return the last name, master.

So how can I do this?

beta asked Jul 21 '16 08:07



1 Answer

You need a second .str accessor to take the last element of the split list in every row. What you did was essentially try to index the Series itself with the label -1, which does not exist:

In [31]:

df['lastname'] = df['fullname'].str.split().str[-1]
df
Out[31]:
         fullname lastname
0   martin master   master
1    andreas test     test
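
For anyone who wants to reproduce this outside the session above, here is a minimal self-contained sketch. The sample frame is rebuilt from the question, the three-part name is added only for illustration, and the column name lastname_rsplit and the rsplit variant are my own additions, not part of the original session.

```python
import pandas as pd

# Sample data mirroring the question, plus one illustrative three-part name.
df = pd.DataFrame(
    {"fullname": ["martin master", "andreas test", "andreas martin master"]}
)

# .str.split() returns a Series whose values are lists of words.
parts = df["fullname"].str.split()

# parts[-1] is a label lookup on the Series index (0, 1, 2), so -1 is not found.
try:
    parts[-1]
except KeyError as exc:
    print("KeyError:", exc)

# A second .str accessor indexes into each row's list instead.
df["lastname"] = parts.str[-1]

# Alternative sketch: split only once, from the right, then take the last piece.
df["lastname_rsplit"] = df["fullname"].str.rsplit(n=1).str[-1]

print(df)
```

Both variants should give the same last name here; the rsplit form just avoids splitting the whole string when only the final word is needed.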
EdChum answered Oct 11 '22 15:10