Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient way to call multiple reduce functions on an iterable in Python?

I'd like to run several reduce functions on an iterable in Python (2.7). An example would be calling min and max on an iterable of integers. But of course you can't call reduce(min, it) and reduce(max, it) on the same iterable, since it's exhausted after the first call. So you might think to do something like:

reduce(lambda a, b: (min(a[0], b[0]), max(a[1], b[1])), ((x, x) for x in it))

And you think, Hey, that's pretty nifty, so you generalize it into something like this:

from itertools import izip

def multireduce(iterable, *funcs):
    """:Return: The tuple resulting from calling ``reduce(func, iterable)`` for each `func` in `funcs`."""
    return reduce(lambda a, b: tuple(func(aa, bb) for func, aa, bb in izip(funcs, a, b)), ((item,) * len(funcs) for item in iterable))

(And you like unit tests, so you include something like this:)

import unittest
class TestMultireduce(unittest.TestCase):
    def test_multireduce(self):
        vecs = (
            ((1,), (min,), (1,)),
            (xrange(10), (min, max), (0, 9)),
            (xrange(101), (min, max, lambda x, y: x + y,), (0, 100, (100 * 101) // 2))
        )
        for iterable, funcs, expected in vecs:
            self.assertSequenceEqual(tuple(multireduce(iterable, *funcs)), expected)

But then you try it, and you realize it's really slow:

%timeit reduce(min, xrange(1000000)) ; reduce(max, xrange(1000000))
10 loops, best of 3: 140 ms per loop
%timeit reduce(lambda a, b: (min(a[0], b[0]), max(a[1], b[1])), ((x, x) for x in xrange(1000000)))
1 loop, best of 3: 682 ms per loop
%timeit multireduce(xrange(1000000), min, max)
1 loop, best of 3: 1.99 s per loop

Ouch. So then you come to Stack Overflow looking for Python optimization wisdom...

like image 870
Doctor J Avatar asked May 18 '26 05:05

Doctor J


1 Answers

Well, there's this, which kind of defeats the point of iterables...

def multireduce(iterable, *funcs):
    """:Return: The tuple resulting from calling ``reduce(func, iterable)`` for each `func` in `funcs`."""
    return tuple(imap(reduce, funcs, tee(iterable, len(funcs))))

But it is quite fast for my test case:

%timeit multireduce(xrange(1000000), min, max)
10 loops, best of 3: 166 ms per loop
like image 133
Doctor J Avatar answered May 20 '26 19:05

Doctor J



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!