I have a pandas DataFrame with a MultiIndex:
(Index col 1)  (Index col 2)  (Data col 1)  ....
A              a              word1
               a              word2
               b              word3
B              a              word4
               c              word5
Now I want to count all the rows that have the same combination of Index column 1 and Index column 2. I've tried df.value_counts(), which gives the error 'DataFrame does not have a method value_counts()'. If I use df.count(), I can only count for level=0 or level=1, not both at the same time (the level parameter does not seem to accept a list, even though I often see that used on Stack Overflow).
Desired output:
A a 2
A b 1
.. etc
[EDIT]: OK so @EdChum's comment solved the problem, but I am still wondering why the other stuff did not work? Specifically: why does value_counts not seem to be defined while it is part of the latest Pandas? Does this have anything to do with me using a Jupyter Notebook? Or do these things change a lot between Pandas versions?
You can groupby on the index levels of interest and call size to return the number of rows for each unique combination:
In [4]: df.groupby(level=[0,1]).size()
Out[4]:
(Index col 1)  (Index col 2)
A              a                2
               b                1
B              a                1
               c                1
dtype: int64
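For anyone who wants to reproduce this, here is a minimal self-contained sketch. The level names idx1 and idx2 and the column name data are placeholders I made up for illustration; the question does not give the real names.

```python
import pandas as pd

# Rebuild the example frame from the question.
# "idx1", "idx2" and "data" are hypothetical names, not from the original post.
index = pd.MultiIndex.from_tuples(
    [("A", "a"), ("A", "a"), ("A", "b"), ("B", "a"), ("B", "c")],
    names=["idx1", "idx2"],
)
df = pd.DataFrame(
    {"data": ["word1", "word2", "word3", "word4", "word5"]},
    index=index,
)

# Group on both index levels by position and count the rows in each group.
counts = df.groupby(level=[0, 1]).size()
print(counts)

# The same grouping, referring to the levels by name instead of position.
counts_by_name = df.groupby(level=["idx1", "idx2"]).size()
```

Note that size counts every row in the group, while count would skip missing values column by column, so size is usually the better fit for "how many rows share this index combination".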
value_counts is a Series method. It was not defined for a DataFrame in the pandas version used here (DataFrame.value_counts was only added in pandas 1.1.0), which is why it didn't work.
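As a rough illustration of where value_counts can be called, the sketch below uses it on a single column (a Series) and on the index itself. The names idx1, idx2 and data are placeholders of mine, and the exact type of the resulting index may differ between pandas versions.

```python
import pandas as pd

# Hypothetical reconstruction of the frame from the question.
index = pd.MultiIndex.from_tuples(
    [("A", "a"), ("A", "a"), ("A", "b"), ("B", "a"), ("B", "c")],
    names=["idx1", "idx2"],
)
df = pd.DataFrame(
    {"data": ["word1", "word2", "word3", "word4", "word5"]},
    index=index,
)

# value_counts works on a Series, i.e. on one selected column.
column_counts = df["data"].value_counts()

# It also works on an Index, so the MultiIndex can be counted directly:
# each distinct (idx1, idx2) pair is treated as one value.
index_counts = df.index.value_counts()
print(index_counts)
```

Unlike the groupby result, value_counts sorts by frequency rather than by index label, so call sort_index() on the result if you want the same ordering as the groupby output.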
You can also use index.get_level_values to group on an index level together with a regular column:
grouped = df.groupby([df.index.get_level_values(0),'Num']).size()
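The frame that this line runs against is not shown, so the sketch below assumes one: a MultiIndex like the one in the question plus a column called Num whose values I invented. Only the groupby line itself comes from the answer.

```python
import pandas as pd

# Assumed example data: the "Num" values and the level names are made up.
index = pd.MultiIndex.from_tuples(
    [("A", "a"), ("A", "a"), ("A", "b"), ("B", "a"), ("B", "c")],
    names=["idx1", "idx2"],
)
df = pd.DataFrame({"Num": [1, 1, 2, 1, 1]}, index=index)

# Group by the first index level combined with the "Num" column,
# then count the rows in each (level 0, Num) group.
grouped = df.groupby([df.index.get_level_values(0), "Num"]).size()
print(grouped)
```

get_level_values(0) returns an Index with one entry per row, which is why it can be passed to groupby alongside a column label.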