
Pandas multi-index count occurrences

I have a Pandas DataFrame with a MultiIndex:

(Index col 1) (Index col 2) (Data col 1) ....
A               a            word1
                a            word2
                b            word3
B               a            word4
                c            word5

Now I want to count all the rows that have the same combination of Index column 1 and Index column 2. I've tried df.value_counts(), which gives the error 'DataFrame does not have a method value_counts()'. If I use df.count(), I can only count for level=0 or level=1, not both at the same time (the level parameter does not seem to accept a list, even though I often see that used on Stack Overflow).

Desired output:

A a 2
A b 1
.. etc

[EDIT]: OK so @EdChum's comment solved the problem, but I am still wondering why the other stuff did not work? Specifically: why does value_counts not seem to be defined while it is part of the latest Pandas? Does this have anything to do with me using a Jupyter Notebook? Or do these things change a lot between Pandas versions?

Celebrian asked Dec 16 '16 11:12


2 Answers

You can group by the index levels of interest and call size, which returns the number of rows for each unique combination of index values:

In [4]:
df.groupby(level=[0,1]).size()

Out[4]:
(Index col 1)  (Index col 2)
A              a                2
               b                1
B              a                1
               c                1
dtype: int64
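
For anyone who wants to reproduce this, here is a minimal, self-contained sketch. The level names idx1 and idx2 and the column name data are placeholders I made up, since the question does not give the real names.

```python
import pandas as pd

# Rebuild the example frame from the question.
# The level names and the column name are assumed, not taken from the original data.
index = pd.MultiIndex.from_tuples(
    [("A", "a"), ("A", "a"), ("A", "b"), ("B", "a"), ("B", "c")],
    names=["idx1", "idx2"],
)
df = pd.DataFrame(
    {"data": ["word1", "word2", "word3", "word4", "word5"]},
    index=index,
)

# Count the rows for each (idx1, idx2) combination.
counts = df.groupby(level=[0, 1]).size()
print(counts)

# Optional: flatten the result into a regular DataFrame with a "count" column.
flat = counts.reset_index(name="count")
print(flat)
```

Passing the level names instead of positions (level=["idx1", "idx2"]) should work the same way if the levels are named.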

value_counts is a Series method. It is not defined for a DataFrame in your version of pandas, which is why it didn't work.
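
To illustrate where value_counts is available, here is a small sketch using the same kind of frame (the level and column names are again placeholders). It calls value_counts on a single column and on the index itself. As far as I know, DataFrame.value_counts was only added in pandas 1.1.0, so it is not used here.

```python
import pandas as pd

# Example frame; level names and column name are assumed for illustration.
index = pd.MultiIndex.from_tuples(
    [("A", "a"), ("A", "a"), ("A", "b"), ("B", "a"), ("B", "c")],
    names=["idx1", "idx2"],
)
df = pd.DataFrame(
    {"data": ["word1", "word2", "word3", "word4", "word5"]},
    index=index,
)

# value_counts works on a single column, because a column is a Series.
word_counts = df["data"].value_counts()
print(word_counts)

# It also works on the index, which counts each (idx1, idx2) combination.
index_counts = df.index.value_counts()
print(index_counts)
```

Note that value_counts sorts by frequency, so the row order differs from the groupby result even though the counts are the same.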

EdChum answered Sep 27 '22 23:09


You can use index.get_level_values to group by an index level together with a regular column (here, an example column named 'Num'):

 grouped = df.groupby([df.index.get_level_values(0),'Num']).size()
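
A runnable sketch of that idea follows. The question's frame has no 'Num' column, so the one below is invented purely for illustration, as are the level names.

```python
import pandas as pd

# Example frame; the 'Num' column and the level names are hypothetical.
index = pd.MultiIndex.from_tuples(
    [("A", "a"), ("A", "a"), ("A", "b"), ("B", "a"), ("B", "c")],
    names=["idx1", "idx2"],
)
df = pd.DataFrame({"Num": [1, 1, 2, 1, 1]}, index=index)

# Group by the first index level together with the 'Num' column,
# then count the rows in each group.
grouped = df.groupby([df.index.get_level_values(0), "Num"]).size()
print(grouped)
```

This is mainly useful when one of the grouping keys is a data column rather than an index level. For counting index levels only, groupby(level=[0, 1]) is shorter.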

Golden Lion answered Sep 27 '22 23:09