There's a (relatively) new sort on the block called Timsort. It's been used as Python's list.sort, and is now going to be the new Array.sort in Java 7.
There's some documentation and a tiny Wikipedia article describing the high-level properties of the sort and some low-level performance evaluations, but I was curious if anybody can provide some pseudocode to illustrate what Timsort is doing, exactly, and what are the key things that make it zippy. (Esp. with regard to the cited paper, "Optimistic Sorting and Information Theoretic Complexity.")
(See also related StackOverflow post.)
TimSort is a highly optimized mergesort, it is stable and faster than old mergesort. when comparing with quicksort, it has two advantages: It is unbelievably fast for nearly sorted data sequence (including reverse sorted data); The worst case is still O(N*LOG(N)).
Timsort is one of the best sorting algorithms in terms of complexity and stability. Unlike “bubble” or “insertion” sorting, Timsort is rather new — it was invented in 2002 by Tim Peters (and named after him).
The average case time complexity of Tim sort is O(n log n). Worst Case Complexity - It occurs when the array elements are required to be sorted in reverse order. That means suppose you have to sort the array elements in ascending order, but its elements are in descending order.
Timsort is a stable sorting algorithm (order of elements with same key is kept) and strives to perform balanced merges (a merge thus merges runs of similar sizes). In pursuit of balanced merges, Timsort considers three runs on the top of the stack, X, Y, Z, and maintains the invariants: |Z| > |Y| + |X|
Quoting the relevant portion from a now deleted blog post: Visualising Sorting Algorithms: Python's timsort
The business-end of timsort is a mergesort that operates on runs of pre-sorted elements. A minimum run length minrun is chosen to make sure the final merges are as balanced as possible - for 64 elements, minrun happens to be 32. Before the merges begin, a single pass is made through the data to detect pre-existing runs of sorted elements. Descending runs are handled by simply reversing them in place. If the resultant run length is less than minrun, it is boosted to minrun using insertion sort. On a shuffled array with no significant pre-existing runs, this process looks exactly like our guess above: pre-sorting blocks of minrun elements using insertion sort, before merging with merge sort.
[...]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With