
pandas DataFrame: selecting the rows with NaN as the index

Tags: python, pandas

I have a dataframe df with the following:

In [10]: df.index.unique()
Out[10]: array([u'DC', nan, u'BS', u'AB', u'OA'], dtype=object)

I can easily select df.ix["DC"], df.ix["BS"], etc., but I'm having trouble selecting the rows whose index is nan.

None of df.ix[nan], df.ix["nan"], or df.ix[np.nan] works.

How do I select the rows with nan as the index?

asked Aug 27 '14 20:08 by lessthanl0l


1 Answer

One way would be to use df.index.isnull() to identify the location of the NaNs:

In [218]: df = pd.DataFrame({'Date': [0, 1, 2, 0, 1, 2],
     ...:                    'Name': ['A', 'B', 'C', 'A', 'B', 'C'],
     ...:                    'val': [0, 1, 2, 3, 4, 5]},
     ...:                   index=['DC', np.nan, 'BS', 'AB', 'OA', np.nan]); df
Out[218]: 
     Date Name  val
DC      0    A    0
NaN     1    B    1
BS      2    C    2
AB      0    A    3
OA      1    B    4
NaN     2    C    5

In [219]: df.index.isnull()
Out[219]: array([False,  True, False, False, False,  True], dtype=bool)

Then you could select those rows using df.loc:

In [220]: df.loc[df.index.isnull()]
Out[220]: 
     Date Name  val
NaN     1    B    1
NaN     2    C    5
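
For anyone who wants to run this outside an interactive session, the steps above can be sketched as a standalone script. The variable names (mask, nan_rows, labelled_rows) are only illustrative, and the complement selection with ~mask is an addition that the session above does not show.

```python
import numpy as np
import pandas as pd

# Same sample frame as in the session above: two rows have NaN as their label.
df = pd.DataFrame(
    {
        "Date": [0, 1, 2, 0, 1, 2],
        "Name": ["A", "B", "C", "A", "B", "C"],
        "val": [0, 1, 2, 3, 4, 5],
    },
    index=["DC", np.nan, "BS", "AB", "OA", np.nan],
)

# Boolean array, True wherever the index label is missing.
mask = df.index.isnull()

# Rows whose index label is NaN.
nan_rows = df.loc[mask]

# Illustrative extra: the rows that do have a label.
labelled_rows = df.loc[~mask]

print(nan_rows)
print(labelled_rows)
```

Because the mask is positional, it works no matter how many NaN labels the index contains.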

Note: My original answer used pd.isnull(df.index) rather than df.index.isnull(), which Zero suggested. df.index.isnull() is the better choice because, for Index types that cannot hold NaNs, such as Int64Index and RangeIndex, the isnull method returns an all-False array immediately instead of checking every item in the index for NaN.
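
A small sketch of the behaviour described in the note, assuming a reasonably recent pandas. The names range_idx and obj_idx are made up for the illustration, and the snippet only checks the results, not the speed difference.

```python
import numpy as np
import pandas as pd

# An index type that cannot hold NaN.
range_idx = pd.RangeIndex(5)

# An object index that does contain a NaN label.
obj_idx = pd.Index(["DC", np.nan, "BS"])

# For the RangeIndex, every entry of the mask is False.
range_mask = range_idx.isnull()

# For the object index, the NaN position is flagged.
obj_mask = obj_idx.isnull()

# The function form and the method form give the same mask.
same = np.array_equal(pd.isnull(obj_idx), obj_mask)

print(range_mask, obj_mask, same)
```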

answered Oct 06 '22 00:10 by unutbu