I have a dataframe like this:
date post
da1 a
da1 b
da2 a
da3 c
da1 d
da1 a
What I want to do is this:
date post total
da1 a 2
da1 b 1
da2 a 1
da3 c 1
da1 d 1
I've tried:
df.groupby(["date","post"]).count().sort_values(['index'], ascending=0)
It sorts in that order, but afterwards I can no longer access the date/post values via df.date or df.post, because all the dates/posts become the index (the "keys") for the values in total instead of regular columns.
It is imperative that I can access the values in the columns via their headers. How should I go about doing this?
I think you need:
print (df.groupby(["date","post"]).size().reset_index(name='total'))
  date post  total
0  da1    a      2
1  da1    b      1
2  da1    d      1
3  da2    a      1
4  da3    c      1
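The output above is sorted by the group keys, whereas the desired output in the question keeps the groups in order of first appearance. A minimal sketch of one way to get that, assuming the sample data from the question and using the sort=False option of groupby (the variable name result is only for illustration):

```python
import pandas as pd

# Sample data taken from the question
df = pd.DataFrame({
    "date": ["da1", "da1", "da2", "da3", "da1", "da1"],
    "post": ["a", "b", "a", "c", "d", "a"],
})

# sort=False keeps the groups in order of first appearance;
# reset_index turns the group keys back into ordinary columns
result = (
    df.groupby(["date", "post"], sort=False)
      .size()
      .reset_index(name="total")
)

print(result)

# date, post and total are regular columns again
print(result.date.tolist())
print(result["total"].tolist())
```

Because reset_index moves date and post out of the index, result.date and result.post work the same way they did on the original dataframe.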
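See also: What is the difference between size and count in pandas? In short, size counts every row in each group, NaN included, while count counts only the non-NaN values in each column.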
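As a rough illustration of the size versus count difference, here is a small sketch. The views column and its values are hypothetical and not part of the question's data; they are there only to show how a missing value is treated.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing value in the "views" column
df = pd.DataFrame({
    "date": ["da1", "da1", "da2"],
    "views": [10.0, np.nan, 5.0],
})

# size counts all rows in each group, including rows with NaN
sizes = df.groupby("date").size()

# count counts only the non-NaN values of the selected column
counts = df.groupby("date")["views"].count()

print(sizes)
print(counts)
```

For the da1 group, size reports 2 rows while count reports 1, because one of the two views values is missing.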