Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Extract indices of cells meeting criteria from pandas DataFrame

I have a DataFrame in pandas like this:

  a b c
A 1 2 3 
B 4 5 6
C 7 8 9

I want to extract the pairs of index name and column name whose cell has a value bigger than 6.

In other words, I want to obtain

[["B","c"], ["C","a"], ["C","b"], ["C","c"]]

Is there any smart way to do this?

like image 778
nakaakist Avatar asked Oct 23 '25 11:10

nakaakist


1 Answers

You could use stack to flatten the frame to a series, use boolean indexing to select the terms you want, and finally turn the resulting index into a list:

s = df.stack()
ii = s[s >= 6].index.tolist()

For example:

>>> s = df.stack()
>>> s
A  a    1
   b    2
   c    3
B  a    4
   b    5
   c    6
C  a    7
   b    8
   c    9
dtype: int64
>>> s[s >= 6]
B  c    6
C  a    7
   b    8
   c    9
dtype: int64
>>> s[s >= 6].index
MultiIndex(levels=[[u'A', u'B', u'C'], [u'a', u'b', u'c']],
           labels=[[1, 2, 2, 2], [2, 0, 1, 2]])
>>> s[s >= 6].index.tolist()
[('B', 'c'), ('C', 'a'), ('C', 'b'), ('C', 'c')]

Note that (1) I'm using >= 6 because that matches your example, and (2) this is strictly a list of tuples not a list of lists like you asked for, but you can convert if you really want to.

like image 153
DSM Avatar answered Oct 26 '25 02:10

DSM



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!