Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python heapq vs. sorted complexity and performance

I'm relatively new to python (using v3.x syntax) and would appreciate notes regarding complexity and performance of heapq vs. sorted.

I've already implemented a heapq based solution for a greedy 'find the best job schedule' algorithm. But then I've learned about the possibility of using 'sorted' together with operator.itemgetter() and reverse=True.

Sadly, I could not find any explanation on expected complexity and/or performance of 'sorted' vs. heapq.

like image 344
ofer.sheffer Avatar asked Jul 10 '14 02:07

ofer.sheffer


People also ask

Is Heapify faster than sort?

The first step in heap sort is to build a min or max heap from the array data and then delete the root element recursively and heapify the heap until there is only one node present in the heap. Heapsort is an efficient algorithm and it performs faster than selection sort.

Is Heapq sorted in python?

The heapq implements a min-heap sort algorithm suitable for use with Python's lists. A heap is a tree-like data structure where the child nodes have a sort-order relationship with the parents.

What is the time complexity of Heapify python?

heapify only takes O(N) time.

What is the time complexity of Heappop?

To delete this root, all heap implementations have a O(log(n)) time complexity. For example the python heapq module implements a heap with an array, and all the time the first element of the array is the root of the heap.


2 Answers

If you use binary heap to pop all elements in order, the thing you do is basically heapsort. It is slower than sort algorightm in sorted function apart from it's implementation is pure python.

The heapq is faster than sorted in case if you need to add elements on the fly i.e. additions and insertions could come in unspecified order. Adding new element preserving inner order in any heap is faster than resorting array after each insertion.

The sorted is faster if you will need to retrieve all elements in order later.

The only problem where they can compete - if you need some portion of smallest (or largest) elements from collection. Although there are special algorigthms for that case, whether heapq or sorted will be faster here depends on the size of the initial array and portion you'll need to extract.

like image 88
Odomontois Avatar answered Oct 05 '22 03:10

Odomontois


The nlargest() and nsmallest() functions of heapq are most appropriate if you are trying to find a relatively small number of items. If you want to find simply single smallest or largest number , min() and max() are most suitable, because it's faster and uses sorted and then slicing. If you are looking for the N smallest or largest items and N is small compared to the overall size of the collection, these functions provide superior performance. Although it's not necessary to use heapq in your code, it's just an interesting topic and a worthwhile subject of study.

like image 24
vikas0713 Avatar answered Oct 05 '22 03:10

vikas0713