Pythonic cumulative map

Tags:

python-3.x

Is there a more pythonic way of doing the following:

def mysteryFunction( it, fun, val ):
    out = []
    for x in it:
        y,val = fun(x,val)
        out.append(y)
    return out,val

where it is iterable, fun is a function that takes two inputs and returns two outputs, and val is an initial value that gets "transformed" by each call to fun?

I am asking because I use map, zip, filter, reduce and list-comprehension on a regular basis, but I cannot express the previous function as a combination of those, and this is something that has come up several times now. Am I missing a hidden idiom, or is this just too niche to deserve one?

A concrete example is to calculate a duration in terms of (year, week, day, hour, minute, second) from a certain amount of seconds:

fac = (365*24*3600, 7*24*3600, 24*3600, 3600, 60, 1)
dur,rem = mysteryFunction( fac, lambda x,y: divmod(y,x), 234567 )

where dur holds the duration components (a list, in the same order as fac) and rem is the final remainder (zero for an integer input, or the fractional part for a float input). This is not cherry-picked; there are many other examples, such as fixed-step methods for integrating differential equations (iterable of steps, stepper function, initial state), simulating a bounded random walk, tree processing across depth without recursion, etc.
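For reference, here is a self-contained run of the duration example above, with the function repeated so the snippet runs on its own. The float input 234567.5 is an illustrative value that is not part of the original example; it is only there to show the non-zero remainder case.

```python
def mysteryFunction(it, fun, val):
    out = []
    for x in it:
        y, val = fun(x, val)  # y is collected, val is threaded to the next call
        out.append(y)
    return out, val

# seconds per (year, week, day, hour, minute, second)
fac = (365*24*3600, 7*24*3600, 24*3600, 3600, 60, 1)

dur, rem = mysteryFunction(fac, lambda x, y: divmod(y, x), 234567)
print(dur, rem)  # [0, 0, 2, 17, 9, 27] 0

# Illustrative float input (an assumption, not from the example above)
dur_f, rem_f = mysteryFunction(fac, lambda x, y: divmod(y, x), 234567.5)
print(dur_f, rem_f)  # [0.0, 0.0, 2.0, 17.0, 9.0, 27.0] 0.5
```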

asked Mar 07 '18 02:03

Jonathan H

1 Answer

This structure is similar to what the itertools.accumulate generator function was designed for. For example, your function might be used with a function like this:

def add2(x, y):
    return (x + y,) * 2  # Return the new sum twice: once as the output, once as the carried value

then called with:

mysteryFunction(range(5), add2, 0)

which would return:

([0, 1, 3, 6, 10], 10)

the accumulated sums of 0 through 4, and the final sum.

itertools.accumulate can do the same thing, but it's lazy (it produces each accumulated value as it's requested), and it only works with functions that take two operands and return a single output; for this case, it ends up being simpler:

from itertools import accumulate
from operator import add

list(accumulate(range(5), add))

would produce the same list as mysteryFunction (and the second result would just be the last value of the list), but you could also use it lazily without storing the results in a list, e.g.:

for partialsum in accumulate(range(5), add):
    ...  # do stuff with partialsum
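As a rough illustration of why the laziness matters, the sketch below stops at the first partial sum above a threshold, so the remaining sums are never computed. The threshold of 20 and the name first_over are made up for the example.

```python
from itertools import accumulate
from operator import add

first_over = None
for partialsum in accumulate(range(1_000_000), add):
    if partialsum > 20:  # hypothetical threshold, chosen for illustration
        first_over = partialsum
        break  # the rest of the million sums are never produced

print(first_over)  # 21
```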

You could likely massage accumulate to handle a two-input, two-output function (or more precisely, discard the value you don't care about from the values output by accumulate), but most of the time I'd expect the second output to be the accumulated value to date rather than something separate, so avoiding the second output would be cleaner.

For fun, a kind of terrible massage of your structure to match accumulate. Let's say you wanted to add a base value to each element in the input, but reduce the base by 1 each time. With your function, you'd do (for initial base of 10):

def addless(new, base):
    return base + new, base - 1

mysteryFunction(range(5), addless, 10)

which (thanks to passing it range that counteracts each decrease in base) produces ([10, 10, 10, 10, 10], 5). Similar code with accumulate might be:

def addless2(last_base, new):
    _, base = last_base
    return base + new, base - 1

then (with some ugliness because, before Python 3.8 added the initial keyword argument, you couldn't specify an initial value for accumulate directly):

from itertools import accumulate, chain

base = 10

# chain used to provide initial value
accum = accumulate(chain(((None, base),), range(5)), addless2)

next(accum)   # Throw away first value that exists solely to make initial tuple

# Put the first value from each `tuple` in `out`, and keep the second value
# only for the last output, to preserve the final base
out, (*_, base) = zip(*accum)

which leaves out as (10, 10, 10, 10, 10) and base as 5, just as in your code, except that out is a tuple rather than a list (apologies for the magic; zip with generalized, nested unpacking is both beautiful and horrifying all at once).
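On Python 3.8 or later, accumulate accepts an initial keyword argument, so the chain workaround should not be needed. A minimal sketch of the same computation under that assumption (addless2 is repeated so the snippet is self-contained):

```python
from itertools import accumulate

def addless2(last_base, new):
    # last_base is the previous (output, base) pair; only base is carried forward
    _, base = last_base
    return base + new, base - 1

# initial= (Python 3.8+) seeds the accumulation in place of the chain() trick
accum = accumulate(range(5), addless2, initial=(None, 10))

next(accum)  # discard the seed pair, which accumulate yields first

# First element of each pair goes to out; only the last base is kept
out, (*_, base) = zip(*accum)
print(out, base)  # (10, 10, 10, 10, 10) 5
```

The seed pair still has to be thrown away, because accumulate yields the initial value as its first output.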

answered Oct 20 '22 18:10

ShadowRanger
