
How do I find duplicate indices in a DataFrame?

I have a pandas DataFrame with a two-level index ("instance" and "index"). I want to find all first-level ("instance") index values that are non-unique and print them.

My frame looks like this:

                 A
instance index    
a        1      10
         2      12
         3       4
b        1      12
         2       5
         3       2
b        1      12
         2       5
         3       2

I want to find "b" as the duplicate 0-level index and print its value ("b") out.

Pat Patterson asked Jan 18 '15 20:01



1 Answer

On older pandas versions you can use the get_duplicates() method of the index (it was deprecated in pandas 0.23 and removed in 1.0):

>>> df.index.get_level_values('instance').get_duplicates()
[0, 1]

(In my example data 0 and 1 both appear multiple times.)
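Where get_duplicates() is not available, the same idea can be expressed with Index.duplicated(). The following is a minimal sketch, not the original answer's code. It assumes the question's frame can be rebuilt by hand as below, and the variable names (repeated_labels, dup_instances) are only illustrative.

```python
import pandas as pd

# Rebuild the frame from the question (assumed construction);
# the whole "b" block appears twice.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("a", 3),
     ("b", 1), ("b", 2), ("b", 3),
     ("b", 1), ("b", 2), ("b", 3)],
    names=["instance", "index"],
)
df = pd.DataFrame({"A": [10, 12, 4, 12, 5, 2, 12, 5, 2]}, index=index)

# Labels that occur more than once in the "instance" level values.
level = df.index.get_level_values("instance")
repeated_labels = level[level.duplicated()].unique().tolist()

# "instance" labels whose full (instance, index) pair is repeated.
dup_rows = df.index[df.index.duplicated()]
dup_instances = dup_rows.get_level_values("instance").unique().tolist()

print(repeated_labels)
print(dup_instances)
```

On this particular frame the first result contains both "a" and "b", because every "instance" label is repeated across the inner "index" level. The second result contains only "b", which appears to be what the question is after.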

The get_level_values() method can accept a label (such as 'instance') or an integer and retrieves the relevant part of the MultiIndex.
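As a small illustration of the two ways of selecting a level, here is a sketch on a made-up three-row MultiIndex (the data is an assumption for demonstration only):

```python
import pandas as pd

# Hypothetical index using the same level names as the question.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)],
    names=["instance", "index"],
)

# Select the outer level by name and by position.
by_label = index.get_level_values("instance")
by_position = index.get_level_values(0)
same = by_label.equals(by_position)

# Select the inner level by position.
inner = index.get_level_values(1).tolist()

print(same)
print(inner)
```

Selecting by name tends to be more robust if the level order changes later, while the integer form also works when the levels are unnamed.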

Alex Riley answered Oct 23 '22 01:10
