How can I return the row index location of the last non-NaN value for each column of a pandas DataFrame, and return the locations as a pandas DataFrame?
Use notnull together with idxmax to get the index label of the last non-NaN value in each column:
In [22]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a':[0,1,2,np.nan], 'b':[np.nan, 1,np.nan, 3]})
df
Out[22]:
    a   b
0   0 NaN
1   1   1
2   2 NaN
3 NaN   3
In [29]:
# reverse the rows so idxmax finds the last non-NaN row, not the largest value
df.notnull()[::-1].idxmax()
Out[29]:
a 2
b 3
dtype: int64
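As a rough sketch of why the mask and the row order matter: idxmax returns the label of the first maximum in each column, so on a boolean notnull mask read from the bottom up it lands on the last non-NaN row. The values below are made up for illustration and are deliberately not increasing; the variable names mask, last_valid and largest are mine, not from the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column 'a' is deliberately not increasing.
df = pd.DataFrame({'a': [3, 1, 2, np.nan], 'b': [np.nan, 1, np.nan, -1]})

# Boolean mask of non-NaN cells, with the rows reversed so the last row comes first.
mask = df.notnull().iloc[::-1]

# idxmax returns the label of the first True in each reversed column,
# which is the last non-NaN row in the original order.
last_valid = mask.idxmax()

# For comparison: idxmax on the values themselves finds the largest value instead.
largest = df.idxmax()
```

Note that for a column containing only NaN the mask has no True at all, so this idiom would still return a label; treat that case separately if it can occur in your data.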
EDIT
Actually, as correctly pointed out by @Caleb, you can use last_valid_index, which is designed for this:
In [3]:
df = pd.DataFrame({'a':[3,1,2,np.nan], 'b':[np.nan, 1,np.nan, -1]})
df
Out[3]:
    a   b
0   3 NaN
1   1   1
2   2 NaN
3 NaN  -1
In [6]:
df.apply(pd.Series.last_valid_index)
Out[6]:
a 2
b 3
dtype: int64
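The result above is a Series, while the question asks for a DataFrame. One possible way to get there, sketched under the assumption that a single-column frame indexed by the original column names is what is wanted, is to_frame; the column name 'last_valid_index' is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [3, 1, 2, np.nan], 'b': [np.nan, 1, np.nan, -1]})

# Series mapping each column name to the label of its last non-NaN row.
last_idx = df.apply(pd.Series.last_valid_index)

# Wrap the Series in a one-column DataFrame; the column name is arbitrary.
result = last_idx.to_frame(name='last_valid_index')
```

last_valid_index returns None for a column with no valid values, so an all-NaN column would show up as a missing entry in the result.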