Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is insertion of heapq is faster than insertion of bisect?

I have a question about bisect and heapq.

First I will show you 2 versions of code and then ask question about it.

version of using bisect:

while len(scoville) > 1:
    a = scoville.pop(0)
    #pops out smallest unit
    if a >= K:
        break
    b = scoville.pop(0)
    #pops out smallest unit
    c = a + b * 2
    bisect.insort(scoville, c)

version of using heapq

while len(scoville) > 1:
    a = heapq.heappop(scoville)
    #pops out smallest unit
    if a >= K:
        break
    b = heapq.heappop(scoville)
    #pops out smallest unit
    c = a + b * 2
    heapq.heappush(scoville, c)

Both algorithms use 2 pops and 1 insert.

As I know, at version of using bisect, pop operation of list is O(1), and insertion operation of bisect class is O(logn).

And at version of using heapq, pop operation of heap is O(1), and insertion operation of heap is O(logn) in average.

So both code should have same time efficiency approximately. However, version of using bisect is keep failing time efficiency test at some code challenge site.

Does anybody have a good guess?

*scoville is a list of integers

like image 245
jinwook han Avatar asked Jan 27 '23 06:01

jinwook han


1 Answers

Your assumptions are wrong. Neither is pop(0) O(1), nor is bisect.insort O(logn).

The problem is that in both cases, all the elements after the element you pop or insert have to be shifted one position to the left or might, making both operations O(n).

From the bisect.insort documentation:

bisect.insort_left(a, x, lo=0, hi=len(a))

Insert x in a in sorted order. This is equivalent to a.insert(bisect.bisect_left(a, x, lo, hi), x) assuming that a is already sorted. Keep in mind that the O(log n) search is dominated by the slow O(n) insertion step.

You can test this by creating a really long list, say l = list(range(10**8)) and then doing l.pop(0) or l.pop() and bisect.insort(l, 0) or bisect.insort(l, 10**9). The operations popping and inserting at the end shoul be instantaneous, while the others have a short but noticeable delay. You can also use %timeit to test it repeatedly on shorter lists, if you alternatingly pop and insert so that the length of the list remains constant over many thousands of runs:

>>> l = list(range(10**6))
>>> %timeit l.pop(); bisect.insort(l, 10**6)
100000 loops, best of 3: 2.21 us per loop
>>> %timeit l.pop(0); bisect.insort(l, 0)
100 loops, best of 3: 14.2 ms per loop

Thus, the version using bisect is O(n) and the one with heapq is O(logn).

like image 139
tobias_k Avatar answered Jan 29 '23 19:01

tobias_k