Grokking Timsort

There's a (relatively) new sort on the block called Timsort. It's been used as Python's list.sort, and is now going to be the new Arrays.sort (for object arrays) in Java 7.

There's some documentation and a tiny Wikipedia article describing the high-level properties of the sort and some low-level performance evaluations, but I'm curious whether anybody can provide some pseudocode to illustrate exactly what Timsort is doing and which key ideas make it zippy (especially with regard to the cited paper, "Optimistic Sorting and Information Theoretic Complexity").

(See also related StackOverflow post.)

asked Nov 14 '09 02:11

Yang

1 Answer

Quoting the relevant portion of a now-deleted blog post, "Visualising Sorting Algorithms: Python's timsort":

The business-end of timsort is a mergesort that operates on runs of pre-sorted elements. A minimum run length minrun is chosen to make sure the final merges are as balanced as possible - for 64 elements, minrun happens to be 32. Before the merges begin, a single pass is made through the data to detect pre-existing runs of sorted elements. Descending runs are handled by simply reversing them in place. If the resultant run length is less than minrun, it is boosted to minrun using insertion sort. On a shuffled array with no significant pre-existing runs, this process looks exactly like our guess above: pre-sorting blocks of minrun elements using insertion sort, before merging with merge sort.
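To make the "minrun" choice concrete, here is a minimal Python sketch modeled on the rule described in CPython's listsort.txt: keep the six most significant bits of n and add 1 if any of the remaining bits are set. The function name compute_minrun is my own, and this is an illustration rather than the actual C code.

```python
def compute_minrun(n: int) -> int:
    """Pick a run length so that n / minrun is a power of two, or just under one.

    Sketch of the rule described in CPython's listsort.txt; the name is
    hypothetical. Arrays shorter than 64 elements get minrun == n, so they
    are sorted entirely by insertion sort.
    """
    r = 0  # becomes 1 if any 1-bit is shifted off the low end
    while n >= 64:
        r |= n & 1
        n >>= 1
    return n + r


if __name__ == "__main__":
    for size in (63, 64, 1000, 2112):
        print(size, compute_minrun(size))
```

For 64 elements this gives 32, matching the quote above: two runs of 32 merge in one perfectly balanced step.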

[...]

timsort finds a descending run, and reverses the run in-place. This is done directly on the array of pointers, so seems "instant" from our vantage point.
The run is now boosted to length minrun using insertion sort.
No run is detected at the beginning of the next block, and insertion sort is used to sort the entire block. Note that the sorted elements at the bottom of this block are not treated specially - timsort doesn't detect runs that start in the middle of blocks being boosted to minrun.
Finally, mergesort is used to merge the runs.
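The steps above can be sketched in Python roughly as follows. This is a simplified illustration under stated assumptions, not the real implementation: all function names are hypothetical, minrun is passed in as a fixed value, runs are merged pairwise in rounds, and the real timsort's run stack, merge invariants, and galloping mode are left out entirely.

```python
def count_run(a, lo, hi):
    """Return the exclusive end of the run starting at a[lo].

    A strictly descending run is reversed in place, so the caller always
    sees an ascending run. Strictness keeps the reversal stable.
    """
    end = lo + 1
    if end == hi:
        return end
    if a[end] < a[lo]:  # strictly descending run
        while end < hi and a[end] < a[end - 1]:
            end += 1
        a[lo:end] = a[lo:end][::-1]
    else:  # non-descending run
        while end < hi and a[end] >= a[end - 1]:
            end += 1
    return end


def insertion_sort(a, lo, hi, start):
    """Sort a[lo:hi] in place, assuming a[lo:start] is already sorted."""
    for i in range(start, hi):
        x = a[i]
        j = i
        while j > lo and a[j - 1] > x:
            a[j] = a[j - 1]
            j -= 1
        a[j] = x


def merge(left, right):
    """Stable merge of two sorted lists into a new list."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            out.append(right[j])
            j += 1
        else:  # ties go to the left run, preserving stability
            out.append(left[i])
            i += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def simple_timsort(a, minrun=32):
    """Sort list a in place and return it (simplified sketch, see lead-in)."""
    n = len(a)
    runs = []  # (start, end) index pairs, end exclusive
    lo = 0
    while lo < n:
        end = count_run(a, lo, n)
        if end - lo < minrun:
            # Boost a short run to minrun elements using insertion sort.
            forced = min(lo + minrun, n)
            insertion_sort(a, lo, forced, end)
            end = forced
        runs.append((lo, end))
        lo = end
    # Merge neighbouring runs pairwise until a single run remains.
    while len(runs) > 1:
        merged = []
        for k in range(0, len(runs) - 1, 2):
            (s, m), (_, e) = runs[k], runs[k + 1]
            a[s:e] = merge(a[s:m], a[m:e])
            merged.append((s, e))
        if len(runs) % 2:
            merged.append(runs[-1])
        runs = merged
    return a
```

Note that once a short run has been boosted, the next call to count_run starts at the end of the boosted block, which is why sorted elements in the middle of a block are not detected as a run.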

answered Oct 12 '22 13:10

u0b34a0f6ae
