Pandas 'reduce' and 'accumulate' functions - incomplete implementation

I would like to use reduce and accumulate in Pandas the way they work on lists in native Python. The functools and itertools implementations of reduce and accumulate (sometimes called fold and cumulative fold in other languages) take a function of two arguments: f(accumulated_value, popped_value). As far as I can tell, Pandas has no equivalent.

So, I have a list of binary variables and want to calculate the running length of each period spent in the 1 state:

In [1]: from itertools import accumulate
        import pandas as pd
        drawdown_periods = [0,1,1,1,0,0,0,1,1,1,1,0,1,1,0]

applying accumulate to this with the lambda function

lambda x,y: (x+y)*y

gives

In [2]: list(accumulate(drawdown_periods, lambda x,y: (x+y)*y))
Out[2]: [0, 1, 2, 3, 0, 0, 0, 1, 2, 3, 4, 0, 1, 2, 0]

counting the length of each drawdown_period.
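
To see why this lambda counts run lengths, the fold can be unrolled into an explicit loop. This is only an illustrative sketch: the names step and run_lengths are hypothetical and not part of itertools or Pandas.

```python
from functools import reduce
from itertools import accumulate

drawdown_periods = [0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0]

def step(acc, value):
    # value == 1: extend the current run (acc + 1)
    # value == 0: multiplying by 0 resets the counter
    return (acc + value) * value

def run_lengths(values):
    # Hypothetical helper: the same fold as accumulate(values, step),
    # written as a plain loop. The first element is emitted unchanged.
    out = []
    acc = None
    for i, value in enumerate(values):
        acc = value if i == 0 else step(acc, value)
        out.append(acc)
    return out

unrolled = run_lengths(drawdown_periods)
folded = list(accumulate(drawdown_periods, step))
# reduce applies the same function but keeps only the final accumulated value
last = reduce(step, drawdown_periods)
print(unrolled == folded, last)
```

accumulate keeps every intermediate value, while reduce returns just the last one, which is why a Pandas-native version of either would need the same two-argument function.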

Is there a smart but quirky way to supply a two-argument lambda function to Pandas? I may be missing a trick here.

I know that there is a lovely recipe with groupby (see StackOverflow How to calculate consecutive Equal Values in Pandas/How to emulate itertools.groupby with a series/dataframe). I'll repeat it since it's so lovely:

In [3]: df = pd.DataFrame(data=drawdown_periods, columns=['dd'])
       df['dd'].groupby((df['dd'] != df['dd'].shift()).cumsum()).cumsum()
Out[3]:
    0     0
    1     1
    2     2
    3     3
    4     0
    5     0
    6     0
    7     1
    8     2
    9     3
    10    4
    11    0
    12    1
    13    2
    14    0
    Name: dd, dtype: int64
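
For readers unfamiliar with the recipe, it can be split into its intermediate steps. A small sketch of the same one-liner; the variable names changed, block_id and run_length are introduced here only for illustration.

```python
import pandas as pd

drawdown_periods = [0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0]
df = pd.DataFrame(data=drawdown_periods, columns=['dd'])

# True wherever the value differs from the previous row
# (row 0 is compared against NaN, so it is always True)
changed = df['dd'] != df['dd'].shift()
# A running count of changes gives every consecutive block its own id
block_id = changed.cumsum()
# Cumulative sum within each block: 1, 2, 3, ... for runs of 1s, 0 for runs of 0s
run_length = df['dd'].groupby(block_id).cumsum()

print(pd.DataFrame({'dd': df['dd'], 'block_id': block_id, 'run_length': run_length}))
```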

This is not the solution I want. I need a way of passing a two-parameter lambda function to a Pandas-native reduce/accumulate function, since that would also work for many other functional programming recipes.

asked May 30 '18 11:05

NBF

2 Answers

You could get this to work using numpy, at the cost of an efficiency penalty. In practice, you may be better off writing ad hoc vectorized solutions.

Using np.frompyfunc:

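import numpy as np
import pandas as pd
s = pd.Series([0,1,1,1,0,0,0,1,1,1,1,0,1,1,0])
f = np.frompyfunc(lambda x, y: (x+y) * y, 2, 1)  # 2 inputs, 1 output
f.accumulate(s.astype(object))  # cast to object so the Python-level ufunc can accumulate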

0     0
1     1
2     2
3     3
4     0
5     0
6     0
7     1
8     2
9     3
10    4
11    0
12    1
13    2
14    0
dtype: object
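
If this pattern is needed in several places, it could be wrapped in small helpers. The following is a minimal sketch, not a Pandas or NumPy API: ufunc_accumulate and ufunc_reduce are hypothetical names, and the result is explicitly re-wrapped in a Series so that the index is preserved whatever the ufunc call returns.

```python
import numpy as np
import pandas as pd

def ufunc_accumulate(series, func):
    # Hypothetical helper, not a pandas API.
    # Wrap the two-argument Python function as a ufunc with 2 inputs and 1 output.
    f = np.frompyfunc(func, 2, 1)
    # Work on a plain object array, then rebuild the Series with the original index.
    values = f.accumulate(series.to_numpy(dtype=object))
    return pd.Series(values, index=series.index, name=series.name)

def ufunc_reduce(series, func):
    # Hypothetical helper: fold the whole Series down to a single value.
    f = np.frompyfunc(func, 2, 1)
    return f.reduce(series.to_numpy(dtype=object))

s = pd.Series([0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0])
# The accumulated values come back as Python objects, so cast to int64 afterwards
acc = ufunc_accumulate(s, lambda x, y: (x + y) * y).astype('int64')
total = ufunc_reduce(s, lambda x, y: x + y)
print(acc.tolist(), total)
```

The cast back to int64 is optional; without it the result stays at dtype object, as in the output above.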

answered Sep 30 '22 05:09

hilberts_drinking_problem

What you are looking for would be a pandas method that extracts every element from a Series, converts it to a Python object, calls a Python function on it, and keeps an accumulator that is also a Python object.

This kind of behavior does not scale well when you have a lot of data, because wrapping the raw data in Python objects carries a lot of time and memory overhead. Pandas methods try to work directly on the underlying (numpy) raw data, which lets them process lots of data without wrapping each value in a Python object. The groupby+cumsum example you give is a clever way of avoiding .apply and Python functions, which would be slower.
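
One way to check this on your own data is to time both approaches side by side. A rough sketch, assuming synthetic random 0/1 data; python_fold and vectorized are hypothetical helper names, and no timings are quoted because they depend on the machine and the data size.

```python
import timeit
from itertools import accumulate

import numpy as np
import pandas as pd

# Synthetic 0/1 data, made up purely for this comparison
rng = np.random.default_rng(0)
s = pd.Series(rng.integers(0, 2, size=100_000))

def python_fold(series):
    # Every element passes through a Python-level function call
    return list(accumulate(series.tolist(), lambda x, y: (x + y) * y))

def vectorized(series):
    # The groupby + cumsum recipe from the question
    blocks = (series != series.shift()).cumsum()
    return series.groupby(blocks).cumsum()

# Both approaches should produce identical run lengths
same = python_fold(s) == vectorized(s).tolist()
t_python = timeit.timeit(lambda: python_fold(s), number=5)
t_vectorized = timeit.timeit(lambda: vectorized(s), number=5)
print(same, t_python, t_vectorized)
```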

Nevertheless, you are of course free to do your own functional thing in Python if you don't care about the performance. As it's all Python anyway and there's no way of speeding it up on the pandas side, you can just write your own:

df["cev"] = list(accumulate(df.dd, lambda x,y:(x+y)*y))
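
If you want the call to read like a Series method, one option is to register a custom accessor through pandas.api.extensions.register_series_accessor. This is a sketch only: the accessor name "fold" and the class FoldAccessor are made up for this example, and the work is still done element by element in Python.

```python
from functools import reduce
from itertools import accumulate

import pandas as pd

@pd.api.extensions.register_series_accessor("fold")
class FoldAccessor:
    # Hypothetical accessor; "fold" is a name chosen for this sketch.
    def __init__(self, series):
        self._series = series

    def accumulate(self, func):
        # Running fold in pure Python, re-wrapped with the original index
        values = list(accumulate(self._series.tolist(), func))
        return pd.Series(values, index=self._series.index, name=self._series.name)

    def reduce(self, func):
        # Fold the whole Series down to a single value
        return reduce(func, self._series.tolist())

df = pd.DataFrame({"dd": [0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0]})
df["cev"] = df["dd"].fold.accumulate(lambda x, y: (x + y) * y)
# Any two-argument function works, e.g. max to find the longest run
longest = df["cev"].fold.reduce(max)
print(df["cev"].tolist(), longest)
```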

answered Sep 30 '22 03:09

w-m

Tags: python, pandas, functools, itertools
