Sum values in a column based on strings from multiple columns in pandas/python

Question

I have a dataframe with 4 columns. 3 of these columns contain string values (people's names) and the 4th one has an int value (salary for a job done).

The string values are not unique: the same name can appear several times in each column, but never more than once per row.

import pandas as pd

data = {
    'worker1': ['Sam', 'Jack', 'Matt', 'Paul', 'Tim'],
    'worker2': ['Alex', 'Amy', 'Sam', 'Alice', 'Amanda'],
    'worker3': ['Alice', 'Aaron', 'Tony', 'Jack', 'Sam'],
    'earnings': [4564552, 4573547, 3567567, 6357653, 7648576]}

df = pd.DataFrame(data, columns=['worker1', 'worker2', 'worker3', 'earnings'])

print(df)

  worker1 worker2 worker3  earnings
0     Sam    Alex   Alice   4564552
1    Jack     Amy   Aaron   4573547
2    Matt     Sam    Tony   3567567
3    Paul   Alice    Jack   6357653
4     Tim  Amanda     Sam   7648576

What I need is to sum all the earnings associated with each name, regardless of whether it appears in worker1, worker2 or worker3. I'm not sure whether I should use a groupby for this, build a dictionary, or go another route.

This is what I'm trying to accomplish:

workers    total_earnings
Sam        15780695
Alex       4564552
Alice      10922205
Jack       10931200
Amy        4573547
Aaron      4573547
Matt       3567567
Tony       3567567
Paul       6357653
Tim        7648576
Amanda     7648576

I'm quite new to pandas, so I don't yet know which functions fit a problem like this. I've mostly tried groupby, but that was a disaster.

Any help would be highly appreciated.

Jonas Byström · Accepted Answer

A bit lengthy, but does what you want:

>>> # Select only 'earnings' so the name columns are left out of the sum
>>> df1 = pd.concat([df.groupby(col)[['earnings']].sum() for col in ['worker1', 'worker2', 'worker3']])
>>> df1.groupby(df1.index).sum()
        earnings
Aaron    4573547
Alex     4564552
Alice   10922205
Amanda   7648576
Amy      4573547
Jack    10931200
Matt     3567567
Paul     6357653
Sam     15780695
Tim      7648576
Tony     3567567
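
For comparison, here is a shorter variant that is not part of the answer above: a minimal sketch that first reshapes the three worker columns into one with melt and then groups once. The names long_df and totals are just illustrative, and the output column names workers and total_earnings are taken from the desired result in the question.

```python
import pandas as pd

data = {
    'worker1': ['Sam', 'Jack', 'Matt', 'Paul', 'Tim'],
    'worker2': ['Alex', 'Amy', 'Sam', 'Alice', 'Amanda'],
    'worker3': ['Alice', 'Aaron', 'Tony', 'Jack', 'Sam'],
    'earnings': [4564552, 4573547, 3567567, 6357653, 7648576],
}
df = pd.DataFrame(data)

# Reshape to long form: every name gets its own row, paired with the
# earnings of the row it came from (hypothetical name: long_df).
long_df = df.melt(id_vars='earnings', value_name='workers')

# Sum earnings per name, regardless of which worker column held it.
totals = (
    long_df.groupby('workers')['earnings']
    .sum()
    .rename('total_earnings')
    .reset_index()
)
print(totals)
```

Because each name appears at most once per row, no row's earnings are counted twice for the same person. The result is sorted alphabetically by name, like the groupby output above.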

Tags: python, pandas

Tom