Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Intercepting heapq

Tags:

python

heap

I want to use Python's heapq module. However, I need to keep track of which index every value is set to.

So I wrote

class heap(list):       
    def __init__(self,xs):
        super(heap,self).__init__(xs)
        self._index_table = {x:i for i,x in enumerate(self)}

    def __setitem__(self,i,v):
        print(i,v)
        super(heap,self).__setitem__(i,v)
        self._index_table[v] = i

    def append(self,x):
        super(heap,self).append(x)
        self._index_table[x] = len(self)-1

from heapq import heapify, heappush, heappop, _siftdown, _siftup

h = heap([4,3,2,1])
heapify(h)
heappush(h,12)
print(h)
print(h._index_table)

And this prints

[1, 3, 2, 4, 12]
{1: 3, 2: 2, 3: 1, 4: 0}

heapify and heappush have modified the entries in my list circumventing my attempts to catch all the assignments.

Why is this happening? Is there a way around this? Is there still a way I can use the heapq module and still keep track of which index each value corresponds to?

EDIT:

Looks like my code had a heapify(h) line I didn't have in my original post. Fixed this.

like image 309
math4tots Avatar asked May 11 '14 23:05

math4tots


People also ask

What is Heapq used for?

The Python heapq module is part of the standard library. It implements all the low-level heap operations as well as some high-level common uses for heaps. A priority queue is a powerful tool that can solve problems as varied as writing an email scheduler, finding the shortest path on a map, or merging log files.

Is Heapq same as priority queue?

This module provides an implementation of the heap queue algorithm, also known as the priority queue algorithm.

Is Heapq Min or Max?

8 Common Data Structures every Programmer must know The heapq module of python implements the heap queue algorithm. It uses the min heap where the key of the parent is less than or equal to those of its children.

What does Heapq merge do?

merge(*iterables) : Merge multiple sorted inputs into a single sorted output (for example, merge timestamped entries from multiple log files). Returns an iterator over the sorted values.


1 Answers

While reading heapq source's code, I noticed as @math4tots did that it was importing a C implementation. So I ran the following to prove whether it is using the python source (which would call methods from list that are overloadable), or it's using the C implementation that uses compiled methods for lists:

>>> class heap(list):        
...     def __init__(self,xs):
...         super(heap,self).__init__(xs)
...         self._index_table = {x:i for i,x in enumerate(self)}
...     
...     def __setitem__(self,i,v):
...         print("SETITEM")
...         print(i,v)
...         super(heap,self).__setitem__(i,v)
...         self._index_table[v] = i
...     
...     def append(self,x):
...         print("APPEND")
...         super(heap,self).append(x)
...         self._index_table[x] = len(self)-1
... 
>>> 
>>> 
>>> h = heap([4,3,2,1])
>>> heapify(h)
>>> h
[1, 3, 2, 4]
>>> h._index_table
{1: 3, 2: 2, 3: 1, 4: 0}
>>> heappush(h,42)
>>> h
[1, 3, 2, 4, 42]
>>> h._index_table
{1: 3, 2: 2, 3: 1, 4: 0}

it does not printout a single string… which means it does not use the python sources we were looking at, but definitely the compiled version.

So your code is unlikely to work as is…

Reading the C source code of the heapq module proves us right: the _siftup function is using PyList_SET_ITEM() to get values from the list overriding any attempt to overload the method.

Though, all hope is not lost, reading the C source code further on, made me discover that actually the C module does not export the _sitf* functions implementing the heapq algorithms. So I've looked at the following, as a double check:

>>> heapq.heapify
<built-in function heapify>
>>> heapq._siftup
<function _siftup at 0x10b36ab00>

Which proved me right!

So, you can always reimplement the heapify() and heappush() functions which are about a couple of lines long, by using the _siftup() and _siftdown() functions from heapq module that are not shadowed by the C code.

so here would be a take at reimplementing it:

import heapq

class HeapQueue(list):
    def __init__(self,xs):
        super(HeapQueue,self).__init__(xs)
        self._index_table = {x:i for i,x in enumerate(self)}
    def __setitem__(self,i,v):
        super(HeapQueue,self).__setitem__(i,v)
        self._index_table[v] = i
    def append(self,x):
        super(HeapQueue,self).append(x)
        self._index_table[x] = len(self)-1
    def push(self, x):
        self.append(x)
        heapq._siftdown(self, 0, len(self)-1)
    def heapify(self):
        n = len(self)
        for i in reversed(range(n//2)):
            heapq._siftup(self, i)

results:

>>> h = HeapQueue([4,3,2,1])
>>> print(h._index_table)
{1: 3, 2: 2, 3: 1, 4: 0}
>>> h.heapify()
>>> print(h)
[1, 3, 2, 4]
>>> print(h._index_table)
{1: 0, 2: 2, 3: 1, 4: 3}
>>> h.push(42)
>>> print(h._index_table)
{1: 0, 2: 2, 3: 1, 4: 3, 42: 4}
>>> print(h)
[1, 3, 2, 4, 42]
>>> 

my guess is that you wouldn't want to keep that heapify() method, but instead make it as part of the constructor, and think through a nicer interface for your own Heap Queue class. I've only did it as proof of concept for that idea.

HTH

like image 109
zmo Avatar answered Nov 15 '22 04:11

zmo