That's O(t), but is insignificant if t is much smaller than n. Then all the remaining elements are added to this "little heap" via heappushpop, one at a time.
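The strategy described above (heapify the first t elements, then feed the rest through heappushpop) could be sketched roughly as follows. The function name `t_largest` is made up for illustration, and this is only a sketch of the idea, not the actual library source.

```python
import heapq

def t_largest(iterable, t):
    """Return the t largest items, largest first (illustrative sketch)."""
    it = iter(iterable)
    # Build the "little heap" from the first t elements: O(t).
    heap = [x for _, x in zip(range(t), it)]
    heapq.heapify(heap)
    if not heap:
        return []
    # Push each remaining element and pop the smallest in one step,
    # so the heap never grows beyond t items.
    for x in it:
        heapq.heappushpop(heap, x)
    return sorted(heap, reverse=True)
```

For example, t_largest([5, 1, 9, 3, 7, 2], 3) returns [9, 7, 5].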
heapq is faster than sorted when you need to add elements on the fly, i.e. when insertions can arrive in an unspecified order. Adding a new element to a heap while preserving its internal order is faster than re-sorting the array after each insertion.
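As a small illustration of the on-the-fly case, the sketch below interleaves insertions with reads of the current minimum. The input values are arbitrary sample data.

```python
import heapq

heap = []
smallest_seen = []
for value in [7, 3, 9, 1, 4]:
    heapq.heappush(heap, value)      # O(log n) per insertion
    smallest_seen.append(heap[0])    # the current minimum is always at index 0

print(smallest_seen)
```

No re-sort is needed between insertions; the smallest element is available at heap[0] after every push.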
The heapq module of Python implements the heap queue algorithm. It uses a min-heap, where the key of a parent is less than or equal to those of its children.
This module provides an implementation of the heap queue algorithm, also known as the priority queue algorithm. Heaps are binary trees for which every parent node has a value less than or equal to any of its children.
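To make the parent/child rule concrete, here is a small sketch that checks it on a list. heapq stores the tree in a plain list where the children of index i sit at 2*i + 1 and 2*i + 2; the helper name `is_min_heap` is made up for this example.

```python
import heapq

def is_min_heap(a):
    """True if every parent a[(i - 1) // 2] is <= its child a[i]."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))

data = [9, 4, 7, 1, 8, 2]
heapq.heapify(data)            # rearranges data in place
print(is_min_heap(data))       # the invariant holds after heapify
print(is_min_heap([3, 1, 2]))  # violated: parent 3 > child 1
```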
heapq is a binary heap, with O(log n) push and O(log n) pop. See the heapq source code.
The algorithm you show takes O(n log n) to push all the items onto the heap, and then O((n-k) log n) to find the kth largest element. So the complexity would be O(n log n). It also requires O(n) extra space.
You can do this in O(n log k), using O(k) extra space by modifying the algorithm slightly. I'm not a Python programmer, so you'll have to translate the pseudocode:
# create a new min-heap
# push the first k nums onto the heap
for the rest of the nums:
    if num > heap.peek()
        heap.pop()
        heap.push(num)
# at this point, the k largest items are on the heap.
# The kth largest is the root:
return heap.pop()
The key here is that the heap contains just the largest items seen so far. If an item is smaller than the kth largest seen so far, it's never put onto the heap. The worst case is O(n log k).
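One possible Python translation of the pseudocode above, using heapq. The function name `kth_largest` and the range check on k are additions for this sketch; heap[0] stands in for heap.peek().

```python
import heapq

def kth_largest(nums, k):
    """Return the k-th largest element of nums using a min-heap of size k."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")
    heap = []
    for num in nums[:k]:               # push the first k nums
        heapq.heappush(heap, num)
    for num in nums[k:]:               # the rest of the nums
        if num > heap[0]:              # heap[0] is the smallest of the k kept so far
            heapq.heappop(heap)
            heapq.heappush(heap, num)
    return heap[0]                     # the root is the k-th largest
```

For example, kth_largest([3, 2, 1, 5, 6, 4], 2) returns 5.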
Actually, heapq has a heapreplace function, so you could replace this:
    if num > heap.peek()
        heap.pop()
        heap.push(num)
with
    if num > heap.peek()
        heap.replace(num)
Also, an alternative to pushing the first k items is to create a list of the first k items and call heapify. A more optimized (but still O(n log k)) algorithm is:
# create array of first `k` items
heap = heapify(array)
for remaining nums
    if (num > heap.peek())
        heap.replace(num)
return heap.pop()
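In Python, this heapify-plus-replace variant might look like the sketch below. The name `kth_largest_replace` and the range check on k are assumptions for illustration; heapq.heapify works in place and returns None, so the list itself serves as the heap.

```python
import heapq

def kth_largest_replace(nums, k):
    """k-th largest via heapify on the first k items, then heapreplace."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")
    heap = list(nums[:k])      # copy, so the caller's list is left untouched
    heapq.heapify(heap)        # O(k), in place
    for num in nums[k:]:
        if num > heap[0]:
            # Pop the smallest and push num in a single O(log k) step.
            heapq.heapreplace(heap, num)
    return heap[0]
```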
You could also call heapify on the entire array, then pop the first n-k items, and then take the top:
heapify(nums)
for i = 0 to n-k
    heapq.heappop(nums)
return heapq.heappop(nums)
That's simpler. Not sure if it's faster than my previous suggestion, but it modifies the original array. The complexity is O(n) to build the heap, then O((n-k) log n) for the pops. So it'd be O(n + (n-k) log n). Worst case O(n log n).
heapify() actually takes linear time because the approach is different from calling heapq.heappush() N times.
heapq.heappush()/heapq.heappop() take O(log n) time because each call moves a single element up or down the tree, at most one step per level, and the tree has about log n levels.
When you pass an array to heapify(), it works from the bottom up, so by the time a node is processed its left and right subtrees already satisfy the heap property, whether it is a min-heap or a max-heap.
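The bottom-up idea can be sketched with a textbook sift-down, shown here for a min-heap. This is an illustration only, not CPython's actual heapify code, and the names `sift_down` and `build_min_heap` are made up. The total work is linear because most nodes sit near the bottom of the tree and can only move down a few levels.

```python
def sift_down(a, i, n):
    """Move a[i] down until neither child is smaller than it."""
    while True:
        left, right, smallest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] < a[smallest]:
            smallest = left
        if right < n and a[right] < a[smallest]:
            smallest = right
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest

def build_min_heap(a):
    """Heapify a in place, bottom-up."""
    n = len(a)
    # Start at the last parent and walk back to the root. When a[i] is
    # sifted down, both subtrees below it are already valid heaps.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
```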
You can see these videos: https://www.youtube.com/watch?v=HqPJF2L5h9U
https://www.youtube.com/watch?v=B7hVxCmfPtM
Hope this helps.
Summarizing @Shivam purbia's post:
heapq.heapify() can reduce both time and space complexity, because it heapifies in place and runs in linear time.
heapq.heappush() and heapq.heappop() each cost O(log N) time.
Final code will be like this ...
import heapq

def findKthLargest(self, nums, k):
    heapq.heapify(nums)                # in-place heapify -> costs O(N) time
    for _ in range(len(nums) - k):     # runs (N-k) times
        heapq.heappop(nums)            # each pop costs O(log N) time
    return heapq.heappop(nums)         # the next smallest is the k-th largest