Splitting column value into 2 new columns - Python Pandas

Tags: python, pandas

I have a dataframe with a column 'name' that holds values like 'James Cameron'. I'd like to split it into 2 new columns, 'First_Name' and 'Last_Name', but there is no delimiter in the data, so I am not quite sure how. I realize that 'James' is in position [0] and 'Cameron' is in position [1], but I am not sure that can be recognized without a delimiter.

import pandas as pd

df = pd.DataFrame({'name': ['James Cameron', 'Martin Sheen'],
                   'Id': [1, 2]})
df

EDIT:

Vaishali's answer below worked perfectly for the dataframe I had provided. I created that dataframe as an example, though. My real code looks like this:

data[['First_Name','Last_Name']] = data.director_name.str.split(' ', expand = True)

and that, unfortunately, is throwing an error:

'Columns must be same length as key'

The column holds the same values as my example though. Any suggestions?

Thanks

JD2775 asked May 26 '17 17:05



1 Answer

The space between the two names is the delimiter, so you can split on it:

df[['Name', 'Lastname']] = df.name.str.split(' ', expand = True)

    Id  name            Name    Lastname
0   1   James Cameron   James   Cameron
1   2   Martin Sheen    Martin  Sheen

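EDIT: Handling the error 'Columns must be same length as key'. The data might have some names with more than one space, e.g. 'George Martin Jr'. For such rows the split yields more than two pieces, so the expanded result has more columns than the two names on the left-hand side. In that case, one way is to split on space and use only the first and second strings, ignoring a third if it exists: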

parts = df.name.str.split(' ', expand = True)  # split once, reuse the result
df['First_Name'] = parts[0]
df['Last_Name'] = parts[1]
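
As a possible variation on the approach above (a sketch, not part of the original answer): the sample data below is hypothetical, and the variable name odd_rows is only for illustration. It first lists the rows that would trigger the error, then splits on the first space only by passing n=1, so the remainder of the name is kept in 'Last_Name' instead of being dropped. It assumes at least one name in the column contains a space.

```python
import pandas as pd

# Hypothetical sample data; 'George Martin Jr' contains two spaces.
df = pd.DataFrame({'name': ['James Cameron', 'Martin Sheen', 'George Martin Jr'],
                   'Id': [1, 2, 3]})

# Rows whose name does not contain exactly one space are the ones that
# break a plain two-column assignment.
odd_rows = df[df['name'].str.count(' ') != 1]
print(odd_rows)

# n=1 splits on the first space only, so the expanded result has at most
# two columns; everything after the first space goes to Last_Name.
df[['First_Name', 'Last_Name']] = df['name'].str.split(' ', n=1, expand=True)
print(df)
```

Whether 'Martin Jr' should count as the last name depends on the data; if only the second word is wanted, taking columns [0] and [1] of the full split as shown above is the simpler choice.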
Vaishali answered Sep 28 '22 06:09