Counting consecutive positive values in Python/pandas array

Tags:

python

pandas

I'm trying to count consecutive up days in equity return data; so if a positive day is 1 and a negative is 0, a list y=[0,0,1,1,1,0,0,1,0,1,1] should return z=[0,0,1,2,3,0,0,1,0,1,2].

I've come up with a solution that takes only a few lines of code, but it is very slow:

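import pandas
from functools import reduce  # reduce is not a builtin in Python 3

y = pandas.Series([0,0,1,1,1,0,0,1,0,1,1])

def f(x):
    # running count that resets to 0 whenever b == 0
    return reduce(lambda a, b: (a + b) * b, x)

# originally pandas.expanding_apply(y, f), which later pandas versions removed
z = y.expanding().apply(f, raw=True)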

I'm guessing I'm looping through the whole list y too many times. Is there a nice Pythonic way of achieving what I want while only going through the data once? I could write a loop myself, but I'm wondering if there's a better way.


asked Dec 23 '14 19:12

alex314159

1 Answer

>>> y = pandas.Series([0,0,1,1,1,0,0,1,0,1,1])

The following may seem a little magical, but actually uses some common idioms: since pandas doesn't yet have nice native support for a contiguous groupby, you often find yourself needing something like this.

>>> y * (y.groupby((y != y.shift()).cumsum()).cumcount() + 1)
0     0
1     0
2     1
3     2
4     3
5     0
6     0
7     1
8     0
9     1
10    2
dtype: int64

Some explanation: first, we compare y against a shifted version of itself to find when the contiguous groups begin:

>>> y != y.shift()
0      True
1     False
2      True
3     False
4     False
5      True
6     False
7      True
8      True
9      True
10    False
dtype: bool
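To see why the very first entry is True, it may help to look at the shifted series on its own. This is only a small sketch; the names prev, starts, first_is_nan and n_groups are mine, not part of the answer above.

```python
import pandas as pd

y = pd.Series([0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1])

# shift() moves every value down one position; slot 0 has no
# predecessor, so it is filled with NaN
prev = y.shift()

# NaN compares unequal to everything, so the first element is always
# flagged as the start of a group
starts = y != prev

first_is_nan = pd.isna(prev.iloc[0])
n_groups = int(starts.sum())  # each True marks the start of one run
```

Each True marks the first element of a run, so summing the flags gives the number of contiguous groups.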

Then (since False == 0 and True == 1) we can apply a cumulative sum to get a number for the groups:

>>> (y != y.shift()).cumsum()
0     1
1     1
2     2
3     2
4     2
5     3
6     3
7     4
8     5
9     6
10    6
dtype: int32
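One way to convince yourself that these numbers really label contiguous runs is to group by them and list each group's members. A quick sketch; group_id and runs are illustrative names I've introduced here.

```python
import pandas as pd

y = pd.Series([0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1])

# one integer label per contiguous run, built from the cumulative sum
# of the "a new group starts here" flags
group_id = (y != y.shift()).cumsum()

# collect the members of each group; every group should hold a single
# run of identical values
runs = y.groupby(group_id).apply(list)
```

With the example data, runs holds six lists, one per run, in order of appearance.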

We can use groupby and cumcount to get us an integer counting up in each group:

>>> y.groupby((y != y.shift()).cumsum()).cumcount()
0     0
1     1
2     0
3     1
4     2
5     0
6     1
7     0
8     0
9     0
10    1
dtype: int64

Add one:

>>> y.groupby((y != y.shift()).cumsum()).cumcount() + 1
0     1
1     2
2     1
3     2
4     3
5     1
6     2
7     1
8     1
9     1
10    2
dtype: int64

And finally zero the values where we had zero to begin with:

>>> y * (y.groupby((y != y.shift()).cumsum()).cumcount() + 1)
0     0
1     0
2     1
3     2
4     3
5     0
6     0
7     1
8     0
9     1
10    2
dtype: int64
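If you start from raw returns rather than a 0/1 series, the steps above could be wrapped up roughly as follows. This is a sketch under my own assumptions: the function names consecutive_ups and consecutive_ups_loop are hypothetical, the return figures are made up so that their signs match y from the question, and a return of exactly zero is treated as "not up". The plain loop is there only as a cross-check.

```python
import pandas as pd


def consecutive_ups(returns):
    """Running count of consecutive positive values; resets to 0 otherwise."""
    up = (returns > 0).astype(int)          # 1 for an up day, 0 otherwise
    group_id = (up != up.shift()).cumsum()  # label each contiguous run
    return up * (up.groupby(group_id).cumcount() + 1)


def consecutive_ups_loop(returns):
    """Plain single-pass loop, used here only to cross-check the result."""
    out, run = [], 0
    for r in returns:
        run = run + 1 if r > 0 else 0
        out.append(run)
    return out


# made-up daily returns whose signs reproduce y from the question
returns = pd.Series(
    [-0.01, -0.02, 0.01, 0.03, 0.02, -0.01, 0.0, 0.01, -0.02, 0.01, 0.02]
)
z = consecutive_ups(returns)
```

Both functions should agree on any input; the groupby version keeps the work inside pandas, while the loop is easier to read and adapt.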

answered Sep 26 '22 01:09

DSM
