Group by value of sum of columns with Pandas

Question

I got lost in Pandas doc and features trying to figure out a way to groupby a DataFrame by the values of the sum of the columns.

for instance, let say I have the following data :

In [2]: dat = {'a':[1,0,0], 'b':[0,1,0], 'c':[1,0,0], 'd':[2,3,4]}

In [3]: df = pd.DataFrame(dat)

In [4]: df
Out[4]: 
   a  b  c  d
0  1  0  1  2
1  0  1  0  3
2  0  0  0  4

I would like columns a, b and c to be grouped since they all have their sum equal to 1. The resulting DataFrame would have columns labels equals to the sum of the columns it summed. Like this :

Any idea to put me in the good direction ? Thanks in advance !

TomAugspurger · Accepted Answer

Here you go:

In [57]: df.groupby(df.sum(), axis=1).sum()
Out[57]: 
   1  9
0  2  2
1  1  3
2  0  4

[3 rows x 2 columns]

df.sum() is your grouper. It sums over the 0 axis (the index), giving you the two groups: 1 (columns a, b, and, c) and 9 (column d) . You want to group the columns (axis=1), and take the sum of each group.

Group by value of sum of columns with Pandas

Tags:

python

pandas

dataframe

group-by

mazieres

1 Answers

TomAugspurger

Recent Activity

Donate For Us

Group by value of sum of columns with Pandas

Tags:

python

pandas

dataframe

group-by

mazieres

1 Answers

TomAugspurger

Related questions

Recent Activity

Donate For Us