Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cartesian product of large iterators (itertools)

From a previous question I learned something interesting. If Python's itertools.product is fed a series of iterators, these iterators will be converted into tuples before the Cartesian product begins. Related questions look at the source code of itertools.product to conclude that, while no intermediate results are stored in memory, tuple versions of the original iterators are created before the product iteration begins.

Question: Is there a way to create an iterator to a Cartesian product when the (tuple converted) inputs are too large to hold in memory? Trivial example:

import itertools
A = itertools.permutations(xrange(100))
itertools.product(A)

A more practical use case would take in a series of (*iterables[, repeat]) like the original implementation of the function - the above is just an example. It doesn't look like you can use the current implementation of itertools.product, so I welcome in submission in pure python (though you can't beat the C backend of itertools!).

like image 863
Hooked Avatar asked Aug 23 '12 13:08

Hooked


People also ask

Is Itertools product faster than for loops?

That being said, the iterators from itertools are often significantly faster than regular iteration from a standard Python for loop.

What does Itertools Zip_longest return?

zip_longest() This iterator falls under the category of Terminating Iterators. It prints the values of iterables alternatively in sequence.

What does Itertools Islice do?

islice() In Python, Itertools is the inbuilt module that allows us to handle the iterators in an efficient way. They make iterating through the iterables like lists and strings very easily.


1 Answers

Here's an implementation which calls callables and iterates iterables, which are assumed restartable:

def product(*iterables, **kwargs):
    if len(iterables) == 0:
        yield ()
    else:
        iterables = iterables * kwargs.get('repeat', 1)
        it = iterables[0]
        for item in it() if callable(it) else iter(it):
            for items in product(*iterables[1:]):
                yield (item, ) + items

Testing:

import itertools
g = product(lambda: itertools.permutations(xrange(100)),
            lambda: itertools.permutations(xrange(100)))
print next(g)
print sum(1 for _ in g)
like image 190
ecatmur Avatar answered Nov 22 '22 18:11

ecatmur