I want to use Python's heapq
module. However, I need to keep track of which index every value is set to.
So I wrote
class heap(list):
def __init__(self,xs):
super(heap,self).__init__(xs)
self._index_table = {x:i for i,x in enumerate(self)}
def __setitem__(self,i,v):
print(i,v)
super(heap,self).__setitem__(i,v)
self._index_table[v] = i
def append(self,x):
super(heap,self).append(x)
self._index_table[x] = len(self)-1
from heapq import heapify, heappush, heappop, _siftdown, _siftup
h = heap([4,3,2,1])
heapify(h)
heappush(h,12)
print(h)
print(h._index_table)
And this prints
[1, 3, 2, 4, 12]
{1: 3, 2: 2, 3: 1, 4: 0}
heapify
and heappush
have modified the entries in my list circumventing my attempts to catch all the assignments.
Why is this happening? Is there a way around this? Is there still a way I can use the heapq
module and still keep track of which index each value corresponds to?
EDIT:
Looks like my code had a heapify(h)
line I didn't have in my original post. Fixed this.
The Python heapq module is part of the standard library. It implements all the low-level heap operations as well as some high-level common uses for heaps. A priority queue is a powerful tool that can solve problems as varied as writing an email scheduler, finding the shortest path on a map, or merging log files.
This module provides an implementation of the heap queue algorithm, also known as the priority queue algorithm.
8 Common Data Structures every Programmer must know The heapq module of python implements the heap queue algorithm. It uses the min heap where the key of the parent is less than or equal to those of its children.
merge(*iterables) : Merge multiple sorted inputs into a single sorted output (for example, merge timestamped entries from multiple log files). Returns an iterator over the sorted values.
While reading heapq
source's code, I noticed as @math4tots did that it was importing a C implementation. So I ran the following to prove whether it is using the python source (which would call methods from list
that are overloadable), or it's using the C implementation that uses compiled methods for lists:
>>> class heap(list):
... def __init__(self,xs):
... super(heap,self).__init__(xs)
... self._index_table = {x:i for i,x in enumerate(self)}
...
... def __setitem__(self,i,v):
... print("SETITEM")
... print(i,v)
... super(heap,self).__setitem__(i,v)
... self._index_table[v] = i
...
... def append(self,x):
... print("APPEND")
... super(heap,self).append(x)
... self._index_table[x] = len(self)-1
...
>>>
>>>
>>> h = heap([4,3,2,1])
>>> heapify(h)
>>> h
[1, 3, 2, 4]
>>> h._index_table
{1: 3, 2: 2, 3: 1, 4: 0}
>>> heappush(h,42)
>>> h
[1, 3, 2, 4, 42]
>>> h._index_table
{1: 3, 2: 2, 3: 1, 4: 0}
it does not printout a single string… which means it does not use the python sources we were looking at, but definitely the compiled version.
So your code is unlikely to work as is…
Reading the C source code of the heapq module
proves us right: the _siftup
function is using PyList_SET_ITEM()
to get values from the list overriding any attempt to overload the method.
Though, all hope is not lost, reading the C source code further on, made me discover that actually the C module does not export the _sitf*
functions implementing the heapq algorithms. So I've looked at the following, as a double check:
>>> heapq.heapify
<built-in function heapify>
>>> heapq._siftup
<function _siftup at 0x10b36ab00>
Which proved me right!
So, you can always reimplement the heapify()
and heappush()
functions which are about a couple of lines long, by using the _siftup()
and _siftdown()
functions from heapq module that are not shadowed by the C code.
so here would be a take at reimplementing it:
import heapq
class HeapQueue(list):
def __init__(self,xs):
super(HeapQueue,self).__init__(xs)
self._index_table = {x:i for i,x in enumerate(self)}
def __setitem__(self,i,v):
super(HeapQueue,self).__setitem__(i,v)
self._index_table[v] = i
def append(self,x):
super(HeapQueue,self).append(x)
self._index_table[x] = len(self)-1
def push(self, x):
self.append(x)
heapq._siftdown(self, 0, len(self)-1)
def heapify(self):
n = len(self)
for i in reversed(range(n//2)):
heapq._siftup(self, i)
results:
>>> h = HeapQueue([4,3,2,1])
>>> print(h._index_table)
{1: 3, 2: 2, 3: 1, 4: 0}
>>> h.heapify()
>>> print(h)
[1, 3, 2, 4]
>>> print(h._index_table)
{1: 0, 2: 2, 3: 1, 4: 3}
>>> h.push(42)
>>> print(h._index_table)
{1: 0, 2: 2, 3: 1, 4: 3, 42: 4}
>>> print(h)
[1, 3, 2, 4, 42]
>>>
my guess is that you wouldn't want to keep that heapify()
method, but instead make it as part of the constructor, and think through a nicer interface for your own Heap Queue class. I've only did it as proof of concept for that idea.
HTH
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With