Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas: return number of occurrences by date

Assume I have the following DataFrame

dic = {"Date": ["04-Jan-16", "04-Jan-16", "04-Jan-16", "05-Jan-16", "05-Jan-16"], 
       "Col": ['A', 'A', 'B', 'A', 'B']}

 df = pd.DataFrame(dic)

 df

  Col       Date
0   A  04-Jan-16
1   A  04-Jan-16
2   B  04-Jan-16
3   A  05-Jan-16
4   B  05-Jan-16

I need to return a second DataFrame that lists the number of occurrences of A and B by day. i.e.

           A  B
Date           
04-Jan-16  2  1
05-Jan-16  1  1

I have a feeling this involves "groupby", but I dont know enough about it to get it into the format above ^

like image 774
measure_theory Avatar asked Apr 18 '17 19:04

measure_theory


2 Answers

You can use the pandas built in to do this pd.crosstab

pd.crosstab(df.Date, df.Col)

Col        A  B
Date           
04-Jan-16  2  1
05-Jan-16  1  1
like image 171
piRSquared Avatar answered Sep 30 '22 17:09

piRSquared


Use pivot_table method:

In [116]: df.pivot_table(index='Date', columns='Col', aggfunc='size') \
            .rename_axis(None, axis=1)
Out[116]:
           A  B
Date
04-Jan-16  2  1
05-Jan-16  1  1

or unstack():

In [121]: df.groupby(['Date', 'Col']).size().unstack('Col')
Out[121]:
Col        A  B
Date
04-Jan-16  2  1
05-Jan-16  1  1
like image 26
MaxU - stop WAR against UA Avatar answered Sep 30 '22 17:09

MaxU - stop WAR against UA