How do I merge duplicate rows into one on a DataFrame when they have different values

I have a DataFrame like the following:

ID  NAME    TEL_1   TEL_2   TEL_3
1   John    123456  754987  465317
1   John    465987          465987
1   John            546783
2   Robert  264687  
2   Robert          462531  
3   William 432645  765346  875137

I need to merge the rows that have the same ID, saving the phone values, like this:

ID  NAME    TEL_1   TEL_2   TEL_3   TEL_4   TEL_5   TEL_6
1   John    123456  754987  465317  465987  465987  546783
2   Robert  264687  462531  
3   William 432645  765346  875137  
Ramiro Federico Khouri asked Feb 04 '26 18:02



1 Answer

You can set the ID and NAME columns as the index, group by them, and then, for each person, concatenate that person's rows end to end so that all of their phone numbers land in a single row:

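import pandas as pd

# Group the phone columns by person; ID and NAME form the group key.
persons = df.set_index(['ID', 'NAME']).groupby(level=['ID', 'NAME'])
new_df = pd.DataFrame()
for details, phones in persons:
    # Lay this person's rows end to end in a single column.
    person_phones = pd.concat([row for i, row in phones.iterrows()]).to_frame()
    # Renumber the labels so they are unique: TEL_0, TEL_1, ...
    person_phones.index = ['TEL_{}'.format(i) for i in range(len(person_phones))]
    new_df = pd.concat([new_df, person_phones], axis=1)

# One row per person, with ID and NAME restored as ordinary columns.
new_df.transpose().reset_index().rename(columns={'level_0': 'ID', 'level_1': 'NAME'})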

to get:

   ID     NAME   TEL_0   TEL_1   TEL_2   TEL_3   TEL_4   TEL_5  TEL_6   TEL_7  TEL_8
0   1     John  123456  754987  465317  465987     NaN  465987    NaN  546783    NaN
1   2   Robert  264687     NaN     NaN     NaN  462531     NaN    NaN     NaN    NaN
2   3  William  432645  765346  875137     NaN     NaN     NaN    NaN     NaN    NaN
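
If you would rather have the numbers packed to the left with no NaN gaps, as in the question's desired output, one possible variant is to flatten each group's phone cells in reading order and drop the blanks before building the row. This is only a sketch: it assumes the blank cells are NaN and that the phone numbers are stored as strings, and the names tel_cols, records and merged are made up for the example.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question. Assumptions: blank cells are NaN
# and phone numbers are strings (so they are not turned into floats).
df = pd.DataFrame({
    'ID':    [1, 1, 1, 2, 2, 3],
    'NAME':  ['John', 'John', 'John', 'Robert', 'Robert', 'William'],
    'TEL_1': ['123456', '465987', np.nan, '264687', np.nan, '432645'],
    'TEL_2': ['754987', np.nan, '546783', np.nan, '462531', '765346'],
    'TEL_3': ['465317', '465987', np.nan, np.nan, np.nan, '875137'],
})

# Every column whose name starts with TEL_ is treated as a phone column.
tel_cols = [c for c in df.columns if c.startswith('TEL_')]

records = []
# sort=False keeps the people in the order they first appear.
for (person_id, name), group in df.groupby(['ID', 'NAME'], sort=False):
    # Read the group's phone cells row by row and skip the missing ones.
    phones = [p for p in group[tel_cols].to_numpy().ravel() if pd.notna(p)]
    record = {'ID': person_id, 'NAME': name}
    # Number the surviving phones consecutively: TEL_1, TEL_2, ...
    record.update({'TEL_{}'.format(i): p for i, p in enumerate(phones, start=1)})
    records.append(record)

# People with fewer numbers get NaN in the trailing TEL_ columns only.
merged = pd.DataFrame(records)
print(merged)
```

With the sample data, merged has one row per ID and columns TEL_1 through TEL_6; duplicate numbers such as John's repeated 465987 are kept, since nothing here deduplicates them.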
Stefan answered Feb 06 '26 10:02



