I'm trying to count consecutive up days in equity return data; so if a positive day is 1 and a negative is 0, a list y=[0,0,1,1,1,0,0,1,0,1,1] should return z=[0,0,1,2,3,0,0,1,0,1,2].
I've come up with a solution that takes only a few lines of code but is very slow:
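from functools import reduce
import pandas

y = pandas.Series([0,0,1,1,1,0,0,1,0,1,1])

def f(x):
    # carry the running count a; multiplying by b resets it on a 0
    return reduce(lambda a, b: (a + b) * b, x)

# expanding window: f re-scans y[:i+1] for every i, hence the slowness
z = y.expanding().apply(f)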
I'm guessing I'm looping through the whole list y too many times. Is there a nice Pythonic way of achieving what I want while only going through the data once? I could write a loop myself, but I'm wondering if there's a better way.
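One possible single-pass version, sketched here for a plain Python list rather than a pandas Series, reuses the same `(a + b) * b` update with `itertools.accumulate`, which keeps every intermediate result instead of only the last one. The helper name `streaks` is made up for illustration.

```python
from itertools import accumulate

def streaks(flags):
    """Running count of consecutive 1s, reset to 0 at every 0 (one pass)."""
    # accumulate carries the previous count `a`; multiplying by the
    # current flag `b` resets the count whenever b == 0.
    return list(accumulate(flags, lambda a, b: (a + b) * b))

y = [0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
z = streaks(y)
print(z)
```

Each element is visited once, so the work grows linearly with the length of the list, whereas the expanding-window approach re-reduces a growing prefix at every step.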
Using the size() or count() method with pandas.DataFrame.groupby() returns the number of occurrences of each value in a particular column of the dataframe.
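As a rough illustration of that difference, the sketch below uses a made-up toy frame (the column names `direction` and `ret` are assumptions, not from the question). `size()` counts rows per group, while `count()` counts only non-null values.

```python
import pandas as pd

# Hypothetical toy data; one return is missing on purpose.
df = pd.DataFrame({
    "direction": ["up", "down", "up", "up", "down"],
    "ret": [0.01, -0.02, 0.03, None, -0.01],
})

sizes = df.groupby("direction").size()            # rows per group, NaN included
counts = df.groupby("direction")["ret"].count()   # non-null values per group

print(sizes)
print(counts)
```

Both calls pool every row with the same key regardless of where it sits in the frame, so on their own they give totals per value rather than lengths of consecutive runs.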
>>> y = pandas.Series([0,0,1,1,1,0,0,1,0,1,1])
The following may seem a little magical, but actually uses some common idioms: since pandas doesn't yet have nice native support for a contiguous groupby, you often find yourself needing something like this.
>>> y * (y.groupby((y != y.shift()).cumsum()).cumcount() + 1)
0     0
1     0
2     1
3     2
4     3
5     0
6     0
7     1
8     0
9     1
10    2
dtype: int64
Some explanation: first, we compare y against a shifted version of itself to find when the contiguous groups begin:
>>> y != y.shift()
0      True
1     False
2      True
3     False
4     False
5      True
6     False
7      True
8      True
9      True
10    False
dtype: bool
Then (since False == 0 and True == 1) we can apply a cumulative sum to get a number for the groups:
>>> (y != y.shift()).cumsum()
0     1
1     1
2     2
3     2
4     2
5     3
6     3
7     4
8     5
9     6
10    6
dtype: int32
We can use groupby and cumcount to get us an integer counting up in each group:
>>> y.groupby((y != y.shift()).cumsum()).cumcount()
0     0
1     1
2     0
3     1
4     2
5     0
6     1
7     0
8     0
9     0
10    1
dtype: int64
Add one:
>>> y.groupby((y != y.shift()).cumsum()).cumcount() + 1
0     1
1     2
2     1
3     2
4     3
5     1
6     2
7     1
8     1
9     1
10    2
dtype: int64
And finally zero the values where we had zero to begin with:
>>> y * (y.groupby((y != y.shift()).cumsum()).cumcount() + 1)
0     0
1     0
2     1
3     2
4     3
5     0
6     0
7     1
8     0
9     1
10    2
dtype: int64
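The steps above can be wrapped into one helper that starts from raw returns instead of a ready-made 0/1 series. This is only a sketch: the function name `consecutive_up_days` and the sample returns are invented for illustration, and it assumes that a flat (zero) or missing return should reset the streak.

```python
import pandas as pd

def consecutive_up_days(returns):
    """Count consecutive positive returns, resetting to 0 on any other day."""
    up = (returns > 0).astype(int)          # 1 for an up day, 0 otherwise
    run_id = (up != up.shift()).cumsum()    # new label whenever the value changes
    # position within each run (1-based), zeroed out on the non-up days
    return up * (up.groupby(run_id).cumcount() + 1)

# Hypothetical daily returns whose signs match the y used in the question.
returns = pd.Series([-0.01, -0.02, 0.01, 0.005, 0.02, -0.01,
                     0.0, 0.03, -0.02, 0.01, 0.04])
streak = consecutive_up_days(returns)
print(streak.tolist())
```

Counting down days instead would only require changing the comparison to `returns < 0`.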