I have a pandas data frame similar to:
ColA ColB
1    1
1    1
1    1
1    2
1    2
2    1
3    2
I want output that works the same way as Counter: I need to know how many times each row appears (with all of the columns being the same).
In this case the proper output would be:
ColA ColB Count
1    1    3
1    2    2
2    1    1
3    2    1
I have tried something of the sort:
df.groupby(['ColA','ColB']).ColA.count()
but this gives me some ugly output that I am having trouble formatting.
You can use the nunique() function to count the number of unique values in a pandas DataFrame.
You can get the unique values in one or more columns of a pandas DataFrame using Series.unique() or pandas.unique(). Series.unique() returns the unique values of a single column, and pandas.unique() can be applied to the combined values of multiple columns.
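As a minimal sketch of those calls, assuming the question's sample data is loaded into a DataFrame named df (the variable names below are only illustrative), note that these functions count or list distinct values per column rather than distinct rows:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ColA": [1, 1, 1, 1, 1, 2, 3],
    "ColB": [1, 1, 1, 2, 2, 1, 2],
})

# Number of distinct values in each column (a Series indexed by column name)
distinct_per_column = df.nunique()

# Distinct values of a single column, in order of first appearance
unique_col_a = df["ColA"].unique()

# Distinct values across both columns, flattened into one 1-D array first
unique_both = pd.unique(df[["ColA", "ColB"]].to_numpy().ravel())

print(distinct_per_column)
print(unique_col_a)
print(unique_both)
```

None of these produce per-row counts on their own, so they do not directly answer the question as asked.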
You can use size with reset_index:
print(df.groupby(['ColA','ColB']).size().reset_index(name='Count'))
   ColA  ColB  Count
0     1     1      3
1     1     2      2
2     2     1      1
3     3     2      1
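For reference, here is a self-contained version of that approach, with the question's sample data rebuilt by hand (the name counts is only illustrative). The attempt in the question returns a Series indexed by a (ColA, ColB) MultiIndex, which is probably why it looked awkward; reset_index turns the index levels back into ordinary columns:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ColA": [1, 1, 1, 1, 1, 2, 3],
    "ColB": [1, 1, 1, 2, 2, 1, 2],
})

# size() counts the rows in each (ColA, ColB) group and returns a Series
# indexed by the group keys; reset_index(name="Count") converts it to a
# flat DataFrame and names the count column
counts = df.groupby(["ColA", "ColB"]).size().reset_index(name="Count")

print(counts)
```

Unlike count(), size() includes rows in which other columns hold NaN, so it is usually the safer choice for counting rows per group.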
I only needed to count the unique rows, so I used DataFrame.drop_duplicates as an alternative:
len(df[['ColA', 'ColB']].drop_duplicates())
On my data it was twice as fast as len(df.groupby(['ColA', 'ColB'])).
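A small sketch comparing the two expressions on the question's sample data (variable names are illustrative). Both should give the number of distinct (ColA, ColB) rows; which one is faster will depend on the data, so it is worth timing on your own frame:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ColA": [1, 1, 1, 1, 1, 2, 3],
    "ColB": [1, 1, 1, 2, 2, 1, 2],
})

# Drop repeated (ColA, ColB) rows, then count what is left
n_unique_rows = len(df[["ColA", "ColB"]].drop_duplicates())

# The length of a groupby object is its number of groups
n_groups = len(df.groupby(["ColA", "ColB"]))

print(n_unique_rows, n_groups)
```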