I've got a problem that shouldn't be that difficult but it's stumping me. There has to be an easy way to do it. I have a series from a dataframe that looks like this:
value
2001-01-04 0.134
2001-01-05 NaN
2001-01-06 NaN
2001-01-07 0.032
2001-01-08 NaN
2001-01-09 0.113
2001-01-10 NaN
2001-01-11 NaN
2001-01-12 0.112
2001-01-13 NaN
2001-01-14 NaN
2001-01-15 0.136
2001-01-16 NaN
2001-01-17 NaN
Iterating from bottom to top, I need to find the index of the earliest value greater than 0.100 whose next earlier non-null value is less than 0.100.
So in the series above, I want to find the index of the value 0.113, which is 2001-01-09. The next earlier value is below 0.100 (0.032 on 2001-01-07). The two later values are also greater than 0.100, but I want the index of the earliest value > 0.100 following a value less than the threshold, iterating bottom to top.
The only way I can think of doing this is reversing the series, iterating to the first (last) value, checking if it is > 0.100, then iterating to the next earlier value and checking whether it's less than 0.100. If it is, I'm done. If it's > 0.100, I have to iterate again and test the next earlier number.
Surely there is a non-messy way to do this I'm not seeing that avoids all this stepwise iteration.
Thanks in advance for your help.
You're essentially looking for two conditions. For the first condition, you want the given value to be greater than 0.1:
df['value'].gt(0.1)
For the second condition, you want the previous non-null value to be less than 0.1:
df['value'].ffill().shift().lt(0.1)
Now, combine the two conditions with the & operator, reverse the resulting Boolean indexer, and use idxmax
to find the first (last) instance where your condition holds:
(df['value'].gt(0.1) & df['value'].ffill().shift().lt(0.1))[::-1].idxmax()
Which gives the expected index value.
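As a check, the approach above can be run end to end on the sample data rebuilt from the question's table:

```python
import pandas as pd
import numpy as np

# Rebuild the sample series from the question.
dates = pd.date_range('2001-01-04', '2001-01-17')
vals = [0.134, np.nan, np.nan, 0.032, np.nan, 0.113, np.nan,
        np.nan, 0.112, np.nan, np.nan, 0.136, np.nan, np.nan]
df = pd.DataFrame({'value': vals}, index=dates)

# Condition 1: current value > 0.1.
# Condition 2: previous non-null value < 0.1.
cond = df['value'].gt(0.1) & df['value'].ffill().shift().lt(0.1)

# Reverse and take idxmax to get the last date where both hold.
result = cond[::-1].idxmax()
print(result)  # 2001-01-09 00:00:00
```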
The above method assumes that at least one value satisfies the condition you've described. If it's possible that your data may not, you may want to use any
to verify that a solution exists:
# Build the condition.
cond = (df['value'].gt(0.1) & df['value'].ffill().shift().lt(0.1))[::-1]

# Check if the condition is met anywhere.
if cond.any():
    idx = cond.idxmax()
else:
    idx = ???
In your question, you've specified both inequalities to be strict. What happens for a value exactly equal to 0.1? You may want to change one of gt/lt to ge/le to account for this.
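To illustrate why the any() guard matters, here is a small sketch (the data is made up for illustration) on a series with no crossing at all; without the guard, idxmax on an all-False Boolean series silently returns the first index. Using pd.NaT as the fallback is an assumption, chosen because the index holds dates:

```python
import pandas as pd
import numpy as np

# A series where every value stays above the threshold, so no
# crossing from < 0.1 to > 0.1 exists.
s = pd.Series([0.2, np.nan, 0.3, 0.25],
              index=pd.date_range('2001-01-04', periods=4))

cond = (s.gt(0.1) & s.ffill().shift().lt(0.1))[::-1]

# cond is False everywhere, so fall back to NaT instead of
# trusting idxmax's answer.
idx = cond.idxmax() if cond.any() else pd.NaT
print(idx)  # NaT
```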
Bookkeeping
# making sure `nan` are actually `nan`
df.value = pd.to_numeric(df.value, 'coerce')
# making sure strings are actually dates
df.index = pd.to_datetime(df.index)
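The bookkeeping steps can be sketched on raw data as it might arrive from a file: string dates in the index and 'Nan' as a literal string (the values here are made up for illustration):

```python
import pandas as pd

# Raw data: string dates in the index, 'Nan' as a string.
raw = pd.DataFrame({'value': ['0.134', 'Nan', '0.032']},
                   index=['2001-01-04', '2001-01-05', '2001-01-07'])

# Coerce strings to numbers; unparseable entries become real NaN.
raw['value'] = pd.to_numeric(raw['value'], 'coerce')
# Parse the string index into actual datetimes.
raw.index = pd.to_datetime(raw.index)

print(raw['value'].isna().sum())  # 1
```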
plan
- dropna
- sort_index
- lt(0.1)
- diff - your scenario happens when we go from < .1 to > .1. In this case, diff will be -1
- idxmax - find the first -1
df.value.dropna().sort_index().lt(.1).astype(int).diff().eq(-1).idxmax()
2001-01-09 00:00:00
Correction to account for the flaw pointed out by @root.
diffs = df.value.dropna().sort_index().lt(.1).astype(int).diff().eq(-1)
diffs.idxmax() if diffs.any() else pd.NaT
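Put together as a runnable snippet on the sample data from the question (dates and values copied from the table above):

```python
import pandas as pd
import numpy as np

dates = pd.date_range('2001-01-04', '2001-01-17')
vals = [0.134, np.nan, np.nan, 0.032, np.nan, 0.113, np.nan,
        np.nan, 0.112, np.nan, np.nan, 0.136, np.nan, np.nan]
df = pd.DataFrame({'value': vals}, index=dates)

# lt(.1) flags sub-threshold values; diff of the int-cast flags is
# -1 exactly where we cross from < 0.1 up to > 0.1.
diffs = df.value.dropna().sort_index().lt(.1).astype(int).diff().eq(-1)
result = diffs.idxmax() if diffs.any() else pd.NaT
print(result)  # 2001-01-09 00:00:00
```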
editorial
This question highlights an important SO dynamic. We who answer questions often do so by editing our answers until they are in a satisfactory state. I have observed that those of us who answer pandas
questions are generally very helpful to each other as well as to those who ask questions.
In this post, I was well informed by @root and subsequently changed my post to reflect the added information. That alone makes @root's post very useful, in addition to the other great information they provided.
Please recognize both posts and upvote as many useful posts as you can.
Thx