Is groupby from pandas commutative?

Question

I would like to know if the rows selected by:

groupby(['a', 'b'])

are the same as the rows selected by:

groupby(['b', 'a'])

In this case the order of the rows doesn't matter.

Is there any case in which groupby does not fulfill the commutative property?

Celius Stingher · Accepted Answer

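By definition, and given the logic pandas applies in groupby, the grouping will always be commutative, in the sense that the same rows end up in the same groups regardless of the order of the keys: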

A groupby operation involves some combination of splitting the object, applying a function, and combining the results.

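The splitting step assigns each row to a group based only on its combination of key values, and that combination does not depend on the order in which the keys are listed, hence the grouping is commutative. The important point is that, when passing multiple by values, the levels of the new index follow the order of the keys, which should be kept in mind when addressing them.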

From Wikipedia's articles on linear combination and the commutative property:

In mathematics, a linear combination is an expression constructed from a set of terms by multiplying each term by a constant and adding the results. The idea that simple operations, such as the multiplication and addition of numbers, are commutative was for many years implicitly assumed.
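The claim that the same rows land in the same groups can be checked directly. The following is a minimal sketch, not part of the original answer: the sample data and the names rows_ab, rows_ba and same_groups are assumptions made for illustration. It compares the row labels stored in the .groups mapping for both key orders.

```python
import pandas as pd

# Illustrative data (an assumption, not taken from the question).
df = pd.DataFrame({
    'a': list('aabbab'),
    'b': [4, 5, 4, 5, 5, 4],
    'c': [7, 8, 9, 4, 2, 3],
})

# .groups maps each key tuple to the index labels of the rows in that group.
groups_ab = df.groupby(['a', 'b']).groups
groups_ba = df.groupby(['b', 'a']).groups

# Normalise both mappings to {(a, b): row labels} so they can be compared.
rows_ab = {key: tuple(idx.tolist()) for key, idx in groups_ab.items()}
rows_ba = {(a, b): tuple(idx.tolist()) for (b, a), idx in groups_ba.items()}

same_groups = rows_ab == rows_ba
print(same_groups)
```

Only the key tuples differ between the two mappings: their elements appear in the order in which the keys were passed.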

jezrael · Answer

I think the order of the keys does not matter for the computed values; it only determines the order of the first columns / index levels after groupby, which follows the order of the columns in the list.

import pandas as pd

df = pd.DataFrame({
    'a': list('aaaaaa'),
    'b': [4, 5, 4, 5, 5, 4],
    'c': [7, 8, 9, 4, 2, 3],
})

Order of levels after groupby aggregation:

df1 = df.groupby(['a', 'b']).sum()
print (df1)
      c
a b    
a 4  19
  5  14

df2 = df.groupby(['b', 'a']).sum()
print (df2)
      c
b a    
4 a  19
5 a  14
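If the two results need to be compared directly, the index levels of one can be brought into the order of the other. This is a small sketch under the assumption that swapping the two levels and sorting is all that is needed; the names df2_aligned and aligned_matches are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'a': list('aaaaaa'),
    'b': [4, 5, 4, 5, 5, 4],
    'c': [7, 8, 9, 4, 2, 3],
})

df1 = df.groupby(['a', 'b']).sum()
df2 = df.groupby(['b', 'a']).sum()

# Swap the two index levels so they read (a, b), then sort to match df1.
df2_aligned = df2.swaplevel().sort_index()

aligned_matches = df2_aligned.equals(df1)
print(aligned_matches)
```

For more than two keys, reorder_levels can be used in place of swaplevel to put the levels into an arbitrary order.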

And the order of columns with as_index=False:

df3 = df.groupby(['a', 'b'], as_index=False).sum()
print (df3)
   a  b   c
0  a  4  19
1  a  5  14

df4 = df.groupby(['b', 'a'], as_index=False).sum()
print (df4)
   b  a   c
0  4  a  19
1  5  a  14

If you use transform to create a new column with the same size as the original, the result is the same:

df['new1'] = df.groupby(['a', 'b'])['c'].transform('sum')
df['new2'] = df.groupby(['b', 'a'])['c'].transform('sum')
print (df)
   a  b  c  new1  new2
0  a  4  7    19    19
1  a  5  8    14    14
2  a  4  9    19    19
3  a  5  4    14    14
4  a  5  2    14    14
5  a  4  3    19    19

manuhortet · Answer

Yes, the final groups will always be the same.

The only difference is the order in which the rows will be shown.
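A short sketch of that difference in row order, using made-up data (the frame and the names order_ab and order_ba are assumptions for illustration). With the default sort=True, the result is sorted by the keys in the order they were passed, so the same groups come out in a different sequence.

```python
import pandas as pd

# Illustrative data (an assumption, not taken from the question).
df = pd.DataFrame({
    'a': list('xyxy'),
    'b': [2, 1, 1, 2],
    'c': [10, 20, 30, 40],
})

# Group keys in the order they appear in each result.
order_ab = list(df.groupby(['a', 'b'])['c'].sum().index)
order_ba = list(df.groupby(['b', 'a'])['c'].sum().index)

print(order_ab)  # sorted by a, then b
print(order_ba)  # sorted by b, then a

# Same set of groups once the key tuples are written in the same order.
same_keys = set(order_ab) == {(a, b) for (b, a) in order_ba}
print(same_keys)
```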

Tags:

python

pandas

commutativity

cristian hantig

