
Summing up multiple values in single row

Given a dataframe such as this, is it possible to add up each country's value even if there are multiple countries in one row? For example, the 1st row contains both Japan and USA, so I would want it to count as Japan=1 and USA=1.

import pandas as pd
import numpy as np

countries=["Europe","USA","Japan"]
data= {'Employees':[1,2,3,4],
    'Country':['Japan;USA','USA;Europe',"Japan","Europe;Japan"]}
df=pd.DataFrame(data)
print(df)

patt = '(' + '|'.join(countries) + ')'
grp = df.Country.str.extractall(pat=patt).values
new_df = df.groupby(grp).agg({'Employees': sum})
print(new_df)

I have tried this, but it raises a "Grouper and axis must be same length" error. Is this the correct way to do it?

ValueError                                Traceback (most recent call last)
<ipython-input-81-53e8e9f0f301> in <module>()
     10 patt = '(' + '|'.join(countries) + ')'
     11 grp = df.Country.str.extractall(pat=patt).values
---> 12 new_df = df.groupby(grp).agg({'Employees': sum})
     13 print(new_df)

    4 frames
    /usr/local/lib/python3.7/dist-packages/pandas/core/groupby/grouper.py in _convert_grouper(axis, grouper)
        842     elif isinstance(grouper, (list, Series, Index, np.ndarray)):
        843         if len(grouper) != len(axis):
    --> 844             raise ValueError("Grouper and axis must be same length")
        845         return grouper
        846     else:

Thus, I would like the end result to be Japan: 8, Europe: 6, USA: 3.

Thanks

brezz asked Mar 10 '26 14:03


1 Answer

Could you please try the following, written and tested with the samples shown. It uses the split, explode, and groupby functions of pandas.

df['Country'] = df['Country'].str.split(';')
df.explode('Country').groupby('Country')['Employees'].sum()

Output will be as follows:

Country
Europe    6
Japan     8
USA       3
Name: Employees, dtype: int64

Explanation: A simple explanation would be:

  • First, split the Country column of the DataFrame on ; and save the result back into the same column.
  • Then use explode on the Country column so that each country gets its own row, group by Country, and apply sum to the Employees column to get the total per country.
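
The two steps above can be sketched with the intermediate result made visible. This is a minimal sketch, not the only way to do it: the names split_df, exploded, totals and matches are illustrative and not from the original post, and it assumes a pandas version that provides DataFrame.explode (0.25 or later). It uses assign so the original df is not overwritten, and the last lines suggest why the attempt in the question likely failed: extractall returns one row per match, not one row per original row.

```python
import pandas as pd

df = pd.DataFrame({
    'Employees': [1, 2, 3, 4],
    'Country': ['Japan;USA', 'USA;Europe', 'Japan', 'Europe;Japan'],
})

# Step 1: split each 'Country' string on ';' into a list of country names.
# assign() returns a new frame, so the original df is left untouched.
split_df = df.assign(Country=df['Country'].str.split(';'))

# Step 2: explode the lists so each country gets its own row; the
# 'Employees' value is repeated for every country in the original row.
exploded = split_df.explode('Country')
print(exploded)

# Step 3: group by country and sum the repeated 'Employees' values.
totals = exploded.groupby('Country')['Employees'].sum()
print(totals)

# For comparison: extractall() yields one row per match (7 here), while df
# has only 4 rows, so its values cannot be used directly as a grouper for df.
matches = df['Country'].str.extractall('(Europe|USA|Japan)')
print(len(matches), len(df))
```

With the sample data, totals holds Europe 6, Japan 8, USA 3, which matches the result requested in the question.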
RavinderSingh13 answered Mar 13 '26 03:03


