Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Traversing heapified list

Tags:

python

heap

I'm making a Monte-Carlo simulation. And as a part of this task I generate samples uniformly distributed over an interval (0,100).

generate = lambda: uniform(0,100)

The iterations stop when all the closest generated points' pairs meet the criteria.

check = lambda a,b: True if (b-a)<5 else False

I need to have some structure to effectively keep all the generated points and be able to go through them in ascending order to perform check on all the subsequent pairs.

There is a heapq module in Python which supports a very effective heap structure. And I decided to use it.

I faced a problem. I have found no traversal procedure supported by this module. The only way I found to access the values of the heap in ascending order is to use heapq.heappop. But it deletes the values from the heap.

I found the workaround for this and just copied the heap object into the new one and iterated with heappop over the new one. But I don't think it's quite effective to copy the whole structure in memory one every iteration.

Is there any other way I can go to do what I'm trying to do more effectively?


The simplified code for illustration.

import heapq
from random import uniform
from itertools import tee, izip, count
from copy import copy


def pairwise(iterable): #get values from iterator in pairs
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)


check = lambda a,b: True if (b-a)<5 else False
generate = lambda: uniform(0,100)


def iterate_heap(heap):
    heap = copy(heap) #Here I have to copy the heap to be able to traverse
    try:
        while True:
            yield heapq.heappop(heap)
    except IndexError:
        return


def trial():
    items = []

    for i in count():
        item = generate()
        heapq.heappush(items, item)

        it = iterate_heap(items)
        it = pairwise(it)

        if i>0 and all(check(a,b) for a,b in it): #if i==0 then 'it' returns no values and 'all' returns True
            return i

print "The solution is reached. It took %d iterations." % trial()

paiwise function is from recipe from here.


Update: In this implementation with heappop the complexity on each iteration is O(n*log(n)):

Copying heap: O(n)

Adding a new value to the heap: O(log(n))

Traversing: n elements * O(log(n)) on popping each value from heap -> O(n*log(n)).

Result: O(n+log(n)+n*log(n)) = O(n*log(n)

But I expect the traversal to be O(n), so the resultant complexity would be O(n).

By the way, if we use just sorted list, we would need to sort the list on each adding, so O(n*log(n)), but the traversal would be n*O(1) -> O(n). So, the resultant complexity is still O(n*log(n)).

I have found a solution. It's to use bisect module. Finding the place to add would be O(log(n)). Adding to the list is of O(n) (because of the implementation all the values after the insertion in place have to be moved). Traversing is O(n). So, the resultant complexity is O(n).

Still, I wounder, if there is a way to solve this task using heaps in Python.

like image 741
ovgolovin Avatar asked Dec 09 '22 05:12

ovgolovin


2 Answers

I would use list.sort() on the heap. That leaves the heap condition intact and makes it possible to iterate over the underlying list directly.

FWIW, the Timsort algorithm used by list.sort will take advantage of the partial ordering that already exists in the heap.

like image 85
Raymond Hettinger Avatar answered Dec 23 '22 20:12

Raymond Hettinger


From the python docs:

These two make it possible to view the heap as a regular Python list without surprises: heap[0] is the smallest item, and heap.sort() maintains the heap invariant!

Is there a reason you can't just treat the heap as a list and iterate over it?

like image 27
ObscureRobot Avatar answered Dec 23 '22 20:12

ObscureRobot