Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Deleting from python heapq in O(logn)

Tags:

python

I have a heap (python, heapq module) like this -

>>> h = []
>>> heappush(h, (5, 'write code'))
>>> heappush(h, (7, 'release product'))
>>> heappush(h, (1, 'write spec'))
>>> heappush(h, (3, 'create tests'))

How do I remove the tuple with item value as "create tests" in O(logn) and preserve the heap property?

This is what I could come up with (not O(logn))

for i in range(len(h)):
   if h[i][1] == "create tests":
      h[i], h[-1] = h[-1], h[i]
      popped = h.pop()
      heapq.heapify(h)
      break
like image 618
Prakhar Avatar asked Dec 10 '12 12:12

Prakhar


Video Answer


3 Answers

If you do need to take an item out of the heap but want to preserve the heap you could do it lazily and discard it when the item comes out naturally, rather than searching through the list for it.

If you store items you want to remove in a blacklist set, then each time you heapq.heappop check if that item is in the set. If it exists discard it and heappop again until you get something that's not blacklisted, or the heap is empty

like image 160
GP89 Avatar answered Oct 08 '22 10:10

GP89


A blacklist set would be problematic if multiple deleted elements have the same value. Instead implement heap_remove using a tombstone-counting-dict:

def heap_remove(heap, value):
  tombstones[value] = tombstones.get(value, 0) + 1
  while len(heap) and heap[0] in tombstones and tombstones[heap[0]]:
      heappop(heap)

As expected, you have amortized O(1) removal time and the top of your heap is always accurate as long as you are not popping from the heap elsewhere.

Consider this list of numbers, which are first all pushed into the heap and then "removed" (not popped) in the same order:

[3, 3, 2, 7, 1, 4, 2]

Inserts work as expected:

After inserting 3 into heap, top = 3
After inserting 3 into heap, top = 3
After inserting 2 into heap, top = 2
After inserting 7 into heap, top = 2
After inserting 1 into heap, top = 1
After inserting 4 into heap, top = 1
After inserting 2 into heap, top = 1

But removals are done by incrementing the object's tombstone. If the top of the heap has it's tombstone set, then remove the object and decrement the tombstone counter.

Called remove( 3 )
  Marking 3 for deletion
Called remove( 3 )
  Marking 3 for deletion
Called remove( 2 )
  Marking 2 for deletion
Called remove( 7 )
  Marking 7 for deletion
Called remove( 1 )
  Marking 1 for deletion
  Deleting top 1
    Updated heap is: [2, 2, 3, 7, 3, 4]
  Deleting top 1
    Updated heap is: [2, 3, 3, 7, 4]
Called remove( 4 )
  Marking 4 for deletion
Called remove( 2 )
  Marking 2 for deletion
  Deleting top 2
    Updated heap is: [3, 3, 4, 7]
  Deleting top 3
    Updated heap is: [3, 7, 4]
  Deleting top 3
    Updated heap is: [4, 7]
  Deleting top 4
    Updated heap is: [7]
  Deleting top 7
    Updated heap is: []

Notice that when the second heap_remove(3) is called @GP89's solution breaks down as 3 was already in the tombstone set.

You can explore this example here.

like image 6
Jedi Avatar answered Oct 08 '22 10:10

Jedi


With above 2 ideas, here is a complete demo: I'll make it clean and concise soon.

from heapq import heappush, heappop

class Solution:

  def demo():

    deleted = {}
    h = [0]

    heappush(h, 789)
    heappush(h, 101)
    heappush(h, 101)

    self.remove(h, 101, deleted)

    max_val = self.peek(h, deleted)

  def remove(self, h, y, deleted):
    deleted[y] = deleted.get(y, 0) + 1
    while len(h) > 0 and h[0] == y and deleted[y] > 0:
        heappop(h)
        deleted[y] -= 1

  def peek(self, h, deleted):
    while len(h) > 0 and deleted.get(h[0],0) > 0:
        deleted[h[0]] -= 1
        heappop(h)
    return h[0]
like image 2
Charlie 木匠 Avatar answered Oct 08 '22 12:10

Charlie 木匠