Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get row, column indices of all non-NaN items in Pandas dataframe

How do I iterate over a dataframe like the following and return the non-NaN value locations as a tuple. i.e.

df:

     0    1    2
0    NaN NaN   1
1    1   NaN  NaN
2    NaN  2   NaN

I would get an output of [(0, 1), (2, 0), (1, 2)]. Would the best way be to do a nested-for loop? Or is there an easier way I'm unaware of through Pandas.

like image 979
SharpObject Avatar asked Dec 15 '22 07:12

SharpObject


1 Answers

Assuming you don't need in order, you could stack the nonnull values and work on index values.

In [26]: list(df[df.notnull()].stack().index)
Out[26]: [(0L, '2'), (1L, '0'), (2L, '1')]

In [27]: df[df.notnull()].stack().index
Out[27]:
MultiIndex(levels=[[0, 1, 2], [u'0', u'1', u'2']],
           labels=[[0, 1, 2], [2, 0, 1]])

Furthermore, using stack method, NaN are ignored anyway.

In [28]: list(df.stack().index)
Out[28]: [(0L, '2'), (1L, '0'), (2L, '1')]
like image 163
Zero Avatar answered Apr 28 '23 09:04

Zero