Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

"sorted 1-d iterator" based on "2-d iterator" (Cartesian product of iterators)

I'm looking for a clean way to do this in Python:

Let's say I have two iterators "iter1" and "iter2": perhaps a prime number generator, and itertools.count(). I know a priori that both are are infinite and monotonically increasing. Now I want to take some simple operation of two args "op" (perhaps operator.add or operator.mul), and calculate every element of the first iterator with every element of the next, using said operation, then yield them one at a time, sorted. Obviously, this is an infinite sequence itself. (As mentioned in comment by @RyanThompson: this would be called the Cartesian Product of these sequences... or, more exactly, the 1d-sort of that product.)

What is the best way to:

  • wrap-up "iter1", "iter2", and "op" in an iterable that itself yields the values in monotonically increasing output.

Allowable simplifying assumptions:

  • If it helps, we can assume op(a,b) >= a and op(a,b) >= b.
  • If it helps, we can assume op(a,b) > op(a,c) for all b > c.

Also allowable:

  • Also acceptable would be an iterator that yields values in "generally increasing" order... by which I mean the iterable could occasionally give me a number less than a previous one, but it would somehow make "side information" available (as via a method on the object) that would say "I'm not guarenteeing the next value I give you will be greater than the one I just gave you, but I AM SURE that all future values will at least be greater than N.".... and "N" itself is monotonically increasing.

The only way I can think to do this is a sort of "diagonalization" process, where I keep an increasing number of partially processed iterables around, and "look ahead" for the minimum of all the possible next() values, and yield that. But this weird agglomeration of a heapq and a bunch of deques just seems outlandish, even before I start to code it.

Please: do not base your answer on the fact that my examples mentioned primes or count().... I have several uses for this very concept that are NOT related to primes and count().


UPDATE: OMG! What a great discussion! And some great answers with really thorough explanations. Thanks so much. StackOverflow rocks; you guys rock.

I'm going to delve in to each answer more thoroughly soon, and give the sample code a kick in the tires. From what I've read so far, my original suspicions are confirmed that there is no "simple Python idiom" to do this. Rather, by one way or another, I can't escape keeping all yielded values of iter1 and iter2 around indefinitely.

FWIW: here's an official "test case" if you want to try your solutions.

import operator

def powers_of_ten():
    n = 0
    while True:
        yield 10**n
        n += 1

def series_of_nines():
    yield 1
    n = 1
    while True:
        yield int("9"*n)
        n += 1

op = operator.mul
iter1 = powers_of_ten()
iter2 = series_of_nines()

# given (iter1, iter2, op), create an iterator that yields:
# [1, 9, 10, 90, 99, 100, 900, 990, 999, 1000, 9000, 9900, 9990, 9999, 10000, ...]
like image 636
Dan H Avatar asked May 09 '11 14:05

Dan H


1 Answers

import heapq
import itertools
import operator


def increasing(fn, left, right):
    """
    Given two never decreasing iterators produce another iterator
    resulting from passing the value from left and right to fn.
    This iterator should also be never decreasing.
    """
    # Imagine an infinite 2D-grid.
    # Each column corresponds to an entry from right
    # Each row corresponds to an entry from left
    # Each cell correspond to apply fn to those two values

    # If the number of columns were finite, then we could easily solve
    # this problem by keeping track of our current position in each column
    # in each iteration, we'd take the smallest value report it, and then
    # move down in that column. This works because the values must increase
    # as we move down the column. That means the current set of values
    # under consideration must include the lowest value not yet reported

    # To extend this to infinite columns, at any point we always track a finite
    # number of columns. The last column current tracked is always in the top row
    # if it moves down from the top row, we add a new column which starts at the top row
    # because the values are increasing as we move to the right, we know that
    # this last column is always lower then any columns that come after it





    # Due to infinities, we need to keep track of all
    # items we've ever seen. So we put them in this list
    # The list contains the first part of the incoming iterators that
    # we have explored
    left_items = [next(left)]
    right_items = [next(right)]

    # we use a heap data structure, it allows us to efficiently
    # find the lowest of all value under consideration
    heap = []

    def add_value(left_index, right_index):
        """
        Add the value result from combining the indexed attributes
        from the two iterators. Assumes that the values have already
        been copied into the lists
        """
        value = fn( left_items[left_index], right_items[right_index] )
        # the value on the heap has the index and value.
        # since the value is first, low values will be "first" on the heap
        heapq.heappush( heap, (value, left_index, right_index) )

    # we know that every other value must be larger then 
    # this one. 
    add_value(0,0)

    # I assume the incoming iterators are infinite
    while True:
        # fetch the lowest of all values under consideration
        value, left_index, right_index = heapq.heappop(heap)

        # produce it
        yield value

        # add moving down the column
        if left_index + 1 == len(left_items):
            left_items.append(next(left))

        add_value(left_index+1, right_index)

        # if this was the first row in this column, add another column
        if left_index == 0:
            right_items.append( next(right) )
            add_value(0, right_index+1)






def fib():
    a = 1
    b = 1
    while True:
        yield a
        a,b = b,a+b



r = increasing(operator.add, fib(), itertools.count() )
for x in range(100):
    print next(r)
like image 139
Winston Ewert Avatar answered Oct 16 '22 19:10

Winston Ewert