pandas dataframe groupby by column position

Question

I have a function that performs a groupby on a pandas DataFrame. The problem is that my DataFrame can have a variable number of columns. I want to aggregate by summing the last column, grouped by the first column. The name of the last column varies, but the name of the first column is fixed.

How can I achieve this groupby? I tried using iloc, and I tried getting the name of the last column with df.columns[-1], but neither approach seemed to work.

Is there a better way to do this than renaming the last column to some common value?

Psidom · Accepted Answer

df.groupby(df.columns[0])[df.columns[-1]].sum() should work.

Example:

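import pandas as pd

df = pd.DataFrame({
    'a': [1, 1, 2, 2],
    'b': [1, 2, 3, 4]
})

# Group by the first column and sum the last one, looking both up by position.
df.groupby(df.columns[0])[df.columns[-1]].sum()
#a
#1    3
#2    7
#Name: b, dtype: int64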
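
For the scenario in the question, where the number of columns varies, the same expression could be wrapped in a small function. This is only a sketch: the function name sum_last_by_first and the sample frames are made up for illustration and are not part of the original answer.

```python
import pandas as pd


def sum_last_by_first(df: pd.DataFrame) -> pd.Series:
    """Sum the last column, grouped by the first column (hypothetical helper)."""
    # Look up both labels by position so the actual column names do not matter.
    first, last = df.columns[0], df.columns[-1]
    return df.groupby(first)[last].sum()


# Illustrative frames with different widths and different last-column names.
narrow = pd.DataFrame({"a": [1, 1, 2, 2], "b": [1, 2, 3, 4]})
wide = pd.DataFrame({"a": [1, 1, 2, 2], "x": [9, 9, 9, 9], "total": [10, 20, 30, 40]})

narrow_result = sum_last_by_first(narrow)
wide_result = sum_last_by_first(wide)
print(narrow_result)
print(wide_result)
```

The result keeps the name of whatever the last column happened to be, so the caller does not need to rename anything beforehand.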

jezrael · Answer

Simply use Series selected with iloc (data borrowed from @Psidom):

s = df.iloc[:, -1].groupby(df.iloc[:, 0]).sum()
print(s)
a
1    3
2    7
Name: b, dtype: int64
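
As a rough check that the positional approach agrees with looking the labels up through df.columns, and to show one way of getting a regular DataFrame back instead of a Series, here is a sketch on the same sample data. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 2], "b": [1, 2, 3, 4]})

# Positional selection: group the last column by the values of the first one.
by_position = df.iloc[:, -1].groupby(df.iloc[:, 0]).sum()

# Label lookup via df.columns, grouping on the DataFrame itself.
by_label = df.groupby(df.columns[0])[df.columns[-1]].sum()

# Compare the two results for this sample data.
same = by_position.equals(by_label)
print(same)

# reset_index() turns the group keys back into an ordinary column.
as_frame = by_position.reset_index()
print(as_frame)
```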

Tags:

python-3.x

pandas

pandas-groupby

add787
