
How to .value_counts() rows while taking into account other columns?

I have a dataframe like this:

    date post
     da1    a
     da1    b
     da2    a
     da3    c
     da1    d
     da1    a

What I want to do is this:

    date post total
     da1   a     2
     da1   b     1
     da2   a     1
     da3   c     1
     da1   d     1

I've tried:

    df.groupby(["date","post"]).count().sort_values(['index'], ascending=0)

That sorts the rows in that order, but I can then no longer access the date/post values via df.date or df.post, because all the dates and posts become "keys" (the index) for the values in total.

It is imperative that I can access the values in the columns via their headers. How should I go about doing this?

raph asked Jan 26 '26 16:01


1 Answer

I think you need:

    print(df.groupby(["date","post"]).size().reset_index(name='total'))

      date post  total
    0  da1    a      2
    1  da1    b      1
    2  da1    d      1
    3  da2    a      1
    4  da3    c      1
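For anyone who wants to try this end to end, here is a minimal, self-contained sketch. It rebuilds the frame from the question by hand (the construction of `df` is my assumption about how the data looks) and also shows `sort=False`, which is not part of the original answer but should keep the groups in order of first appearance, as in the desired output.

```python
import pandas as pd

# Rebuild the frame from the question (assumed construction).
df = pd.DataFrame({
    "date": ["da1", "da1", "da2", "da3", "da1", "da1"],
    "post": ["a", "b", "a", "c", "d", "a"],
})

# size() counts the rows in each (date, post) group; reset_index() turns
# the group keys back into ordinary columns and names the counts "total".
total = df.groupby(["date", "post"]).size().reset_index(name="total")

# sort=False keeps groups in order of first appearance instead of
# sorting them by key.
total_in_order = (
    df.groupby(["date", "post"], sort=False).size().reset_index(name="total")
)

print(total_in_order)
# The keys are plain columns again, so attribute access works.
print(total_in_order.date.tolist())
```

Because `date` and `post` are regular columns after `reset_index`, both `total.date` and `total["post"]` work as they did on the original frame.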

See also: What is the difference between size and count in pandas?
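In short, as far as I understand it: `size` counts every row in a group, while `count` counts only the non-null values in each remaining column. A small sketch, where the `views` column is a hypothetical addition that is not in the question's data:

```python
import pandas as pd

# "views" is a made-up column with one missing value, added only to
# show where size() and count() disagree.
df = pd.DataFrame({
    "date": ["da1", "da1", "da2"],
    "post": ["a", "a", "b"],
    "views": [10.0, float("nan"), 5.0],
})

grouped = df.groupby(["date", "post"])

# size() returns a Series: number of rows per group, NaN rows included.
sizes = grouped.size()

# count() returns a DataFrame: non-null values per non-key column.
counts = grouped.count()

print(sizes)
print(counts)
```

With the question's two-column frame there are no columns left after grouping by both, so `count()` has nothing to count, which is why `size()` is the better fit there.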

jezrael answered Jan 28 '26 07:01


