I'm trying to find the most recent index with a value that is not 'NaN' relative to the current index. So, say I have a DataFrame with 'NaN' values like this:
       A       B       C
0    2.1     5.3     4.7
1    5.1     4.6     NaN
2    5.0     NaN     NaN
3    7.4     NaN     NaN
4    3.5     NaN     NaN
5    5.2     1.0     NaN
6    5.0     6.9     5.4
7    7.4     NaN     NaN
8    3.5     NaN     5.8
If I am currently at index 4, I have the values:
       A       B       C
4    3.5     NaN     NaN
I want to know the last known value of 'B' relative to index 4, which is at index 1:
       A       B       C
1    5.1   -> 4.6    NaN
I know I can get a list of all indexes with NaN values using something like:
indexes = df.index[df['B'].apply(np.isnan)]
But this seems inefficient for a large DataFrame. Is there a way to get just the most recent non-NaN index relative to the current index?
You could try something like this: convert the index to a Series that is NaN wherever column B is NaN, then use ffill(), which carries the last non-missing index forward over all subsequent NaNs:
import pandas as pd
import numpy as np
df['Last_index_notnull'] = df.index.to_series().where(df.B.notnull(), np.nan).ffill()
df['Last_value_notnull'] = df.B.ffill()
df

Now at index 4, you can see that the last non-missing value of B is 4.6 and that it sits at index 1 (stored as 1.0, because the NaN entries make the column float).
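If you only need the answer for one position rather than a new column for every row, a lighter option may be to slice the column up to the current label and ask pandas for the last valid index. The sketch below rebuilds the example frame from the question; the helper name last_valid_before is made up for illustration, and it assumes a sorted, unique index so that the label slice df.loc[:current] behaves as expected. Like ffill(), it returns the current row itself when that row is not NaN.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Example frame from the question.
df = pd.DataFrame({
    "A": [2.1, 5.1, 5.0, 7.4, 3.5, 5.2, 5.0, 7.4, 3.5],
    "B": [5.3, 4.6, nan, nan, nan, 1.0, 6.9, nan, nan],
    "C": [4.7, nan, nan, nan, nan, nan, 5.4, nan, 5.8],
})

def last_valid_before(frame, column, current):
    """Hypothetical helper: (label, value) of the last non-NaN entry
    in `column` at or before the index label `current`."""
    window = frame.loc[:current, column]  # label-based slice, end inclusive
    label = window.last_valid_index()     # None if every entry is NaN
    if label is None:
        return None, np.nan
    return label, window.loc[label]

label, value = last_valid_before(df, "B", 4)
print(label, value)  # 1 4.6
```

This only inspects the rows up to the current one, so it avoids building a boolean mask over the whole column when a single lookup is all you need.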