
Combine similar rows into a single row in pandas [duplicate]

I have the DataFrame nbr below:

||Postal_Code|Borough|Neighborhood|
|0|M3A|North York|Parkwoods|
|1|M4A|North York|Victoria Village|
|2|M5A|Downtown Toronto|Harbourfront|
|3|M5A|Downtown Toronto|Regent Park|
|4|M6A|North York|Lawrence Heights|
|5|M6A|North York|Lawrence Manor|
|6|M7A|Queen’s Park|Queen’s Park|

I want to run Python code that merges rows 4 and 5 into one row and returns the result below. (I have tried the groupby and agg methods, but I could not get them to work here.)

||Postal_Code|Borough|Neighborhood|
|0|M3A|North York|Parkwoods|
|1|M4A|North York|Victoria Village|
|2|M5A|Downtown Toronto|Harbourfront|
|3|M5A|Downtown Toronto|Regent Park|
|4|M6A|North York|Lawrence Heights , Lawrence Manor|
|5|M7A|Queen’s Park|Queen’s Park|

My code is below:

nbr1.index = pd.RangeIndex(len(nbr1.index))
# More than one neighborhood can exist in one postal code area.

for row_index,row in nbr1.iterrows():
    if(nbr1.loc[row_index,['Postal_Code']].values.astype('str') == nbr1.loc[row_index + 1,['Postal_Code']].values.astype('str')):
        print('inside same Postal code')
        print(nbr1.loc[row_index,['Postal_Code']].values.astype('str'))
        print(nbr1.loc[row_index + 1,['Postal_Code']].values.astype('str'))

    if(nbr1.loc[row_index,['Borough']].values.astype('str') == nbr1.loc[row_index + 1,['Borough']].values.astype('str')):
        print('inside same Borough')
        print(nbr1.loc[row_index,['Borough']].values.astype('str'))
        print(nbr1.loc[row_index + 1,['Borough']].values.astype('str'))
        print(nbr1.loc[row_index,['Neighborhood']].values.astype('str'))
        print(nbr1.loc[row_index + 1,['Neighborhood']].values.astype('str'))
        print('Adding')
        nbr1[row_index,['Neighborhood']] = nbr1.loc[row_index,['Neighbourhood']].values.astype('str').apply(lambda x: '-'.join(x +1), axis=1)
Mansi Gupta asked Dec 04 '25 13:12


1 Answer

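You can use groupby and agg: group by Postal_Code, keep the first Borough in each group, and join the Neighborhood values with ', '.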

df.groupby('Postal_Code').agg({'Borough':'first',
                               'Neighborhood': ', '.join}).reset_index()

Output:

  Postal_Code  Borough           Neighborhood
0 M3A          North York        Parkwoods
1 M4A          North York        Victoria Village
2 M5A          Downtown Toronto  Harbourfront, Regent Park
3 M6A          North York        Lawrence Heights, Lawrence Manor
4 M7A          Queen’s Park      Queen’s Park
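
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and, as an assumption that goes beyond the snippet above, groups on both Postal_Code and Borough so that rows merge only when both values match. The sort=False and as_index=False arguments are optional choices made for this sketch: they keep the groups in order of first appearance and keep the keys as ordinary columns.

```python
import pandas as pd

# Sample data copied from the question; the variable name nbr follows the question text.
nbr = pd.DataFrame({
    "Postal_Code": ["M3A", "M4A", "M5A", "M5A", "M6A", "M6A", "M7A"],
    "Borough": ["North York", "North York", "Downtown Toronto", "Downtown Toronto",
                "North York", "North York", "Queen’s Park"],
    "Neighborhood": ["Parkwoods", "Victoria Village", "Harbourfront", "Regent Park",
                     "Lawrence Heights", "Lawrence Manor", "Queen’s Park"],
})

# Assumption for this sketch: group on both key columns, so rows are merged
# only when the postal code AND the borough are identical.
# sort=False keeps groups in order of first appearance;
# as_index=False leaves the grouping keys as regular columns.
merged = (
    nbr.groupby(["Postal_Code", "Borough"], as_index=False, sort=False)
       .agg({"Neighborhood": ", ".join})
)

print(merged)
```

With this sample data every repeated postal code is merged, so the two M5A rows collapse into one row just as the two M6A rows do. Note that ', '.join expects every Neighborhood value to be a string.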
Abhi answered Dec 07 '25 13:12
