
Finding when a value in a pandas.Series crosses/reaches a threshold

Tags: python, pandas

Consider the following series:

s = pd.Series([0,1,2,3,4,1,5,4,3,2,1])

Is there an easy way of knowing how many times the value 2 is reached or crossed (without the obvious iterative solution)?

The expected result for the example above should be 4 (the 2 line is crossed up or down 4 times in the series).

Edit: updated example case

asked Jul 08 '15 10:07 by joao



1 Answer

This is easily achievable with the Series.shift method, since you only need to look one step ahead to know whether the series has crossed the line.

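import pandas as pd

s = pd.Series([0,1,2,3,4,1,5,4,3,2,1])
df = pd.DataFrame({'s':s})
# next_s holds the following value; the last row has no successor, so it is NaN
df['next_s'] = df.s.shift(-1)
line = 2  # the threshold to test against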

df
    s  next_s
0   0       1
1   1       2
2   2       3
3   3       4
4   4       1
5   1       5
6   5       4
7   4       3
8   3       2
9   2       1
10  1     NaN

Now you can use a simple vectorized conditional that flags every row where the series crosses or sits on the line:

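df['cross'] = (
    ((df.s >= line) & (df.next_s < line)) |   # at or above the line now, below it next
    ((df.next_s > line) & (df.s <= line)) |   # at or below the line now, above it next
    (df.s == line))                           # sitting exactly on the line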

df
    s  next_s  cross
0   0       1  False
1   1       2  False
2   2       3   True
3   3       4  False
4   4       1   True
5   1       5   True
6   5       4  False
7   4       3  False
8   3       2  False
9   2       1   True
10  1     NaN  False

Now it's quite easy to sum up the booleans to get the count:

df.cross.sum()
4
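
For reuse, the same test can be applied to the Series directly, without building an intermediate DataFrame. The following is only a minimal sketch of that idea: the helper name `count_crossings` and its parameters are hypothetical, not part of pandas, and the condition is copied from the expression above rather than changed.

```python
import pandas as pd


def count_crossings(s: pd.Series, line: float) -> int:
    """Count rows where s crosses or touches `line` (hypothetical helper)."""
    nxt = s.shift(-1)  # value one step ahead; NaN for the last row
    cross = (
        ((s >= line) & (nxt < line))    # at or above now, below next
        | ((nxt > line) & (s <= line))  # at or below now, above next
        | (s == line)                   # sitting exactly on the line
    )
    # Comparisons against NaN are False, so the last row only counts
    # if it sits exactly on the line.
    return int(cross.sum())


s = pd.Series([0, 1, 2, 3, 4, 1, 5, 4, 3, 2, 1])
print(count_crossings(s, 2))  # 4, the same count as df.cross.sum()
```

Note that because of the `s == line` clause, every row that sits exactly on the line is counted. A plateau such as `[1, 2, 2, 3]` therefore yields 2 rather than 1, which may or may not be what you want.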
answered Nov 09 '22 18:11 by firelynx