I'm trying to find the most recent index with a value that is not 'NaN' relative to the current index. So, say I have a DataFrame with 'NaN' values like this:
     A    B    C
0  2.1  5.3  4.7
1  5.1  4.6  NaN
2  5.0  NaN  NaN
3  7.4  NaN  NaN
4  3.5  NaN  NaN
5  5.2  1.0  NaN
6  5.0  6.9  5.4
7  7.4  NaN  NaN
8  3.5  NaN  5.8
If I am currently at index 4, I have the values:
     A    B    C
4  3.5  NaN  NaN
I want to know the last known value of 'B' relative to index 4, which is at index 1:
     A       B    C
1  5.1  -> 4.6  NaN
I know I can get a list of all indexes with NaN values using something like:
indexes = df.index[df['B'].apply(np.isnan)]
But this seems inefficient in a large database. Is there a way to tail just the last one relative to the current index?
You could try something like this: convert the index to a Series that has the same NaN values as column B, then use ffill(), which carries the last non-missing index forward over all subsequent NaNs:
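import pandas as pd
import numpy as np
# Keep the index label where B has a value, NaN elsewhere, then forward-fill
df['Last_index_notnull'] = df.index.to_series().where(df.B.notnull(), np.nan).ffill()
# Forward-fill B itself to carry the last known value down
df['Last_value_notnull'] = df.B.ffill()
df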
Now at index 4, you know the last non-missing value is 4.6 and its index is 1.
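If only a single lookup is needed rather than two new columns, a minimal sketch along the following lines may be enough. It rebuilds the sample DataFrame from the question and uses pandas' `Series.last_valid_index()` on the slice up to the current index. The helper name `last_known` is made up for illustration, and the sketch assumes the index is sorted so that the label-based slice `.loc[:current]` covers exactly the rows up to and including the current one.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [2.1, 5.1, 5.0, 7.4, 3.5, 5.2, 5.0, 7.4, 3.5],
    "B": [5.3, 4.6, np.nan, np.nan, np.nan, 1.0, 6.9, np.nan, np.nan],
    "C": [4.7, np.nan, np.nan, np.nan, np.nan, np.nan, 5.4, np.nan, 5.8],
})


def last_known(frame, column, current):
    """Hypothetical helper: return (index, value) of the last non-NaN entry
    in `column` at or before the label `current`, or (None, None) if none."""
    # .loc slicing is label-based and includes the end label.
    idx = frame.loc[:current, column].last_valid_index()
    if idx is None:
        return None, None
    return idx, frame.at[idx, column]


idx, value = last_known(df, "B", 4)
print(idx, value)

# The forward-fill approach from the answer, for comparison.
last_index = df.index.to_series().where(df["B"].notna()).ffill()
last_value = df["B"].ffill()
```

The helper inspects only the rows up to the current label, which suits occasional lookups. The forward-filled columns are computed once for the whole frame and are probably the better fit when the lookup is repeated for many rows.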