Remove duplicates and add a column using Python pandas

Question

Is it possible to do the following using Python pandas?

I have a CSV file like Table A.

TABLE A
------------------------------------------------
Name               Email
------------------------------------------------
Hinckley Joel      [email protected]
Hinckley Joel      [email protected] 
Hinckley Joel      [email protected]
Joel Hinckley      [email protected]
Siegel Allison     [email protected]
Nielsen Tami       [email protected]
Nielsen Tami       [email protected]
...

I want to remove the rows with duplicated names and add a new column, "Secondary Email".
The secondary email should be the first email from the duplicated rows.

The final table I want to make is Table B.

TABLE B
-----------------------------------------------------------
Name               Email                   Secondary Email
-----------------------------------------------------------
Hinckley Joel      [email protected]          [email protected]
Siegel Allison     [email protected]
Nielsen Tami       [email protected]

As you can see from Tables A and B, I want to treat two rows as the same person even if the first and last names are swapped (e.g. "Hinckley Joel" and "Joel Hinckley").
I also want to take the secondary email (e.g. [email protected]) and put it in the new column.

Thank you in advance.

Quang Hoang · Accepted Answer

This is a pivot on two columns, but you need to remove the duplicate rows first:

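# Assumes df holds Table A with columns "Name" and "Email".
(df.drop_duplicates()                                     # drop exact duplicate rows
   .assign(col=lambda x: x.groupby("Name").cumcount())    # number each name's emails 0, 1, ...
   .pivot(index='Name', columns='col', values='Email')    # one column per email number
   .add_prefix('Email_').reset_index()                    # Email_0, Email_1, ...
)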

Output:

col            Name            Email_0               Email_1
0     Hinckley Joel     [email protected]  [email protected]
1     Joel Hinckley  [email protected]                   NaN
2      Nielsen Tami     [email protected]       [email protected]
3    Siegel Allison  [email protected]                   NaN
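
Note that the output above still lists "Joel Hinckley" and "Hinckley Joel" as separate people, which the question asked to avoid. One possible way to handle that is to group on an order-insensitive key built from the sorted, lower-cased words of the name before numbering the emails. The sketch below is only an illustration: the name_key and dedupe_with_secondary helpers are made up here, the example.com addresses are placeholders (the addresses in the post are redacted), and it assumes names differ only in word order, case, and surrounding whitespace.

```python
import pandas as pd


def name_key(name: str) -> str:
    """Hypothetical helper: 'Joel Hinckley' and 'Hinckley Joel' get the same key."""
    return " ".join(sorted(name.lower().split()))


def dedupe_with_secondary(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per person with Email and Secondary Email."""
    out = (
        df.assign(Name=df["Name"].str.strip(), Email=df["Email"].str.strip())
          .assign(key=lambda d: d["Name"].map(name_key))
          # One row per (person, email) pair, keeping first-seen order.
          .drop_duplicates(subset=["key", "Email"])
          # Number each person's distinct emails 0, 1, 2, ...
          .assign(n=lambda d: d.groupby("key").cumcount())
    )
    # Display name: the first spelling seen for each person, in first-seen order.
    names = out.groupby("key", sort=False)["Name"].first()
    wide = (
        out[out["n"] < 2]  # keep at most two addresses per person, as in Table B
        .pivot(index="key", columns="n", values="Email")
        .reindex(columns=[0, 1])  # make sure both columns exist
        .rename(columns={0: "Email", 1: "Secondary Email"})
    )
    wide.insert(0, "Name", names)  # aligned on the key index
    wide = wide.loc[names.index].reset_index(drop=True)
    wide.columns.name = None
    return wide


# Placeholder data shaped like Table A (addresses are invented).
df = pd.DataFrame({
    "Name": ["Hinckley Joel", "Hinckley Joel", "Hinckley Joel", "Joel Hinckley",
             "Siegel Allison", "Nielsen Tami", "Nielsen Tami"],
    "Email": ["joel.a@example.com", "joel.a@example.com", "joel.b@example.com",
              "joel.b@example.com", "allison@example.com",
              "tami@example.com", "tami@example.com"],
})

result = dedupe_with_secondary(df)
print(result)
```

With this approach a third or later address for the same person is dropped; removing the n < 2 filter and the reindex/rename steps would instead keep every address in its own numbered column, as in the pivot above.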

Tags:

python

pandas

Shi J
