
Stable Sorting, i.e., Minimally-Disruptive Sorting

Suppose I have a list of things (numbers, to keep things simple here) and a function I want to sort them by, using SortBy. For example, the following sorts a list of numbers by last digit:

SortBy[{301, 201}, Mod[#,10]&]

Notice that both of those numbers have the same last digit, so as far as the sort key is concerned it doesn't matter which order they are returned in. In this case Mathematica returns them in the opposite order, {201, 301}. How can I ensure that all ties are broken in favor of how the items were ordered in the original list?

(I know it's kind of trivial but I feel like this comes up from time to time so I thought it would be handy to get it on StackOverflow. I'll post whatever I come up with as an answer if no one beats me to it.)

Attempts at making this more searchable: sort with minimal disturbance, sort with least number of swaps, custom tie-breaking, sorting with costly swapping, stable sorting.

PS: Thanks to Nicholas for pointing out that this is called stable sorting. It was on the tip of my tongue!

dreeves asked Jul 21 '10 23:07


People also ask

What is stable sorting?

A sorting algorithm is said to be stable if two objects with equal keys appear in the same order in the sorted output as they appear in the input. In other words, stability describes how the algorithm treats elements whose keys compare equal.

What is the difference between a stable and unstable sorting algorithm?

A stable sorting algorithm maintains the relative order of the items with equal sort keys. An unstable sorting algorithm does not. In other words, when a collection is sorted with a stable sorting algorithm, items with the same sort keys preserve their order after the collection is sorted.

What is stable and in-place sorting?

Stable means that elements with equal keys keep their relative input order; the order changes only where the sort key requires it. A stable sort applied to a sequence of equal elements will not change their order. In-place means that the input and output occupy the same memory storage space.

Which sorting algorithm is in-place but not stable?

Heap sort is an in-place algorithm but is not stable.


2 Answers

After asking around, I was given a satisfying explanation:

Short answer: You want SortBy[list, {f}] to get a stable sort.

Long answer:

SortBy[list, f] sorts list in the order determined by applying f to each element of list, breaking ties using the canonical ordering explained under Sort. (This is the second documented "More Information" note in the documentation for SortBy.)

SortBy[list, {f, g}] breaks ties using the order determined by applying g to each element.

Note that SortBy[list, f] is the same as SortBy[list, {f, Identity}].
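The three forms above can be compared side by side. This is a minimal sketch, assuming integer input; the names list, lastDigit, byDefault, byIdentity, and byDescending are illustrative only and do not come from the documentation.

```mathematica
list = {19, 301, 201, 502, 501, 101, 300};
lastDigit = Mod[#, 10] &;  (* illustrative helper name *)

(* Single function: ties fall back to the canonical ordering used by Sort *)
byDefault = SortBy[list, lastDigit]
(* {300, 101, 201, 301, 501, 502, 19} *)

(* Explicit Identity as the tie-breaker *)
byIdentity = SortBy[list, {lastDigit, Identity}]

(* Custom tie-breaker g: among equal last digits, larger numbers first *)
byDescending = SortBy[list, {lastDigit, -# &}]
(* {300, 501, 301, 201, 101, 502, 19} *)

byDefault === byIdentity
(* True *)
```

In every case the groups appear in order of last digit; only the order inside the group of numbers ending in 1 changes.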

SortBy[list, {f}] does no tie breaking (and gives a stable sort), which is what you want:

In[13]:= SortBy[{19, 301, 201, 502, 501, 101, 300}, {Mod[#, 10] &}]

Out[13]= {300, 301, 201, 501, 101, 502, 19}

Finally, sakra's solution SortBy[list, {f, tie++ &}] is effectively equivalent to SortBy[list, {f}].
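If you would rather not depend on a side-effecting counter such as tie++, the same idea can be spelled out by pairing each element with its original position and using that position as the explicit tie-breaker. This is only a sketch: stableSortBy is a hypothetical helper name, not a built-in function.

```mathematica
(* stableSortBy is a hypothetical helper, not part of Mathematica *)
stableSortBy[list_List, f_] :=
  Module[{indexed},
    (* Pair every element with its position: {{elem, 1}, {elem, 2}, ...} *)
    indexed = MapIndexed[{#1, First[#2]} &, list];
    (* Sort by f of the element, then break ties by original position *)
    First /@ SortBy[indexed, {f[First[#]] &, Last}]
  ]

stableSortBy[{19, 301, 201, 502, 501, 101, 300}, Mod[#, 10] &]
(* {300, 301, 201, 501, 101, 502, 19} *)
```

The result matches SortBy[list, {f}] for this input, which is why the shorter form is preferable in practice.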

Andrew Moylan answered Oct 03 '22 21:10


Does GatherBy do what you want?

Flatten[GatherBy[{301, 201, 502, 501, 101}, Mod[#, 10] &]]
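One caveat worth checking before relying on this: GatherBy keeps each group in original order, but it lists the groups in order of first appearance rather than in key order. The example above comes out sorted only because a number ending in 1 appears before the number ending in 2. A sketch with the longer list from the other answer, where sorting the groups by their key is my own addition and not part of this answer:

```mathematica
list = {19, 301, 201, 502, 501, 101, 300};
lastDigit = Mod[#, 10] &;  (* illustrative helper name *)

groups = GatherBy[list, lastDigit]
(* {{19}, {301, 201, 501, 101}, {502}, {300}} *)

unsorted = Flatten[groups]
(* {19, 301, 201, 501, 101, 502, 300}, grouped but not sorted by last digit *)

(* Order the groups by the key of their first element, then flatten one level *)
sorted = Flatten[SortBy[groups, lastDigit[First[#]] &], 1]
(* {300, 301, 201, 501, 101, 502, 19} *)
```

Since every group has a distinct key, sorting the groups introduces no ties, and the order inside each group is left untouched.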
Mark Fisher answered Oct 03 '22 22:10