How can I apply groupby twice on a pandas DataFrame?

Question

I have a pandas DataFrame with the columns 'year', 'month' and 'transaction id'. I want to get the transaction count for every month of every year. For example, my data looks like this:

year: {2015,2015,2015,2016,2016,2017}
month: {1,  1,   2,   2,   2,    1}
tid: {123,  343, 453, 675, 786, 332}

I want the output to give, for every year, the number of transactions per month. For example, for the year 2015 I would get:

month: [1,2]
count: [2,1]

I used groupby('year'), but how can I get the per-month transaction count after that?

jezrael · Accepted Answer

You need to group by both columns, year and month, and then aggregate with size:

import pandas as pd

year = [2015,2015,2015,2016,2016,2017]
month =  [1,  1,   2,   2,   2,    1]
tid = [123,  343, 453, 675, 786, 332]

df = pd.DataFrame({'year':year, 'month':month,'tid':tid})
print (df)
   year  month  tid
0  2015      1  123
1  2015      1  343
2  2015      2  453
3  2016      2  675
4  2016      2  786
5  2017      1  332

df1 = df.groupby(['year','month'])['tid'].size().reset_index(name='count')
print (df1)
   year  month  count
0  2015      1      2
1  2015      2      1
2  2016      2      2
3  2017      1      1
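
To get the per-year view asked for in the question (months and counts for 2015 only), one possible follow-up is to filter the counted frame, or to pivot the months into columns with unstack. This is a minimal sketch on the same sample data; the names counts_2015 and wide are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "year":  [2015, 2015, 2015, 2016, 2016, 2017],
    "month": [1, 1, 2, 2, 2, 1],
    "tid":   [123, 343, 453, 675, 786, 332],
})

# One row per (year, month) pair with the number of transactions
counts = df.groupby(["year", "month"]).size().reset_index(name="count")

# Keep only the rows for a single year
counts_2015 = counts.loc[counts["year"] == 2015, ["month", "count"]]
print(counts_2015)

# Alternative layout: years as rows, months as columns, 0 where a month is absent
wide = df.groupby(["year", "month"]).size().unstack(fill_value=0)
print(wide)
```

The filtered frame suits a single-year lookup, while the pivoted one makes it easy to compare all years at once.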

Owen · Answer

Another option for more complex tasks: suppose you want to group by "year" and by a function applied to "tid", e.g. a bucket categorization:

def tidBucket(x):
    # Use a chained comparison; `&` binds tighter than `<=` and `<`
    if x < 300:         return "low"
    if 300 <= x < 700:  return "medium"
    if 700 <= x:        return "high"

Then the above solution would not work directly, because the bucket is not a column of the DataFrame. You could solve the problem by first grouping by year and then iterating over the groups, applying a second groupby to each one:

gb = df.groupby(by='year')
for _, df1 in gb:
    # A function passed to groupby is called on the index labels, so index by tid first
    df1 = df1.set_index("tid")
    df1 = df1.groupby(by=tidBucket)

Then aggregate each inner groupby as desired. Alternatively, you could create an additional "bucket" column:

df["bucket"] = df["tid"].map(tidBucket)

and then follow @jezrael's solution, grouping by "year" and "bucket".
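
Put together, the bucket-column variant might look like the sketch below. It reuses the sample data from the question and the tidBucket function from this answer (with the comparison written as a chained comparison); the name bucket_counts is only illustrative.

```python
import pandas as pd

def tidBucket(x):
    # Map a transaction id to a coarse category
    if x < 300:
        return "low"
    if 300 <= x < 700:
        return "medium"
    return "high"

df = pd.DataFrame({
    "year":  [2015, 2015, 2015, 2016, 2016, 2017],
    "month": [1, 1, 2, 2, 2, 1],
    "tid":   [123, 343, 453, 675, 786, 332],
})

# Materialize the bucket as a regular column ...
df["bucket"] = df["tid"].map(tidBucket)

# ... and group by both columns in one step, as in the accepted answer
bucket_counts = df.groupby(["year", "bucket"]).size().reset_index(name="count")
print(bucket_counts)
```

Compared with the nested loop, this keeps everything in a single groupby call and returns one flat result frame.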

Tags: python, pandas, group-by

neha
