Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm for ordering a list based on the ordering of another, partial, list

This question is different to other questions on the topic of sorting a list based on the order of another list in the sense that the order list does not contain all the keys used in the list.

Say I have a list [a, b, c, d, e] and my order list [b, d, e].

Now I change my order list to [b, e, d]. Is there a relatively simple algorithm to resort the original list? Let's say it's not important whether the final ordering is [a, b, e, c, d] or [a, b, c, e, d], and the order list will always be a subset of the original list.

Edit: clearing up some questions about the final ordering, from my example: e was ordered to be between b and d, and in the sorted list it doesn't matter if e ends up adjacent to b or d. But for example, if due to this sorting a moved to after b - while a legal ordering - it's not desirable.

like image 471
Niel Avatar asked Aug 05 '15 15:08

Niel


1 Answers

You can accomplish what you want by setting up a custom comparator as Dennis Callanan has suggested and then doing a stable sort of the array. Quicksort is not a stable sort; it will generally change the order of elements not included in the partial ordering. Merge sort is a stable sort.

If you use linear search in the comparator, the algorithm will run in time O(n^2 log n) I think. What you need to do to make it run faster is to make one run through the new array, hashing the position of each element in the array for fast lookups.

I could implement it in Java but I don't know Python, sorry.

Another perspective on this is to use a topological sort. In your original list you had a -> b -> c -> d -> e where the arrows mean a "comes before" b. Then you add the new data b -> e -> d. You've got to break any arrows in the first list that lead to contradictions, i.e., d -> e. What you then have is a bunch of arrows:

a -> b, b -> c, c -> d, b -> e, e -> d

If that is a directed acyclic graph (i.e., no contradictions), you can then toposort it in O(V + E) time. Since the number of edges is at most 2n, that's O(n) time, very efficient.

The issue is deciding what arrows to break (and maybe replace with other arrows) in the original list so that there aren't any contradictions. In general that's an NP-hard problem called minimum feedback arc set but I suspect there's something in the structure of your problem that would make it run faster.

Finally, what about just replacing the elements in the (non-contiguous) subarray [ ..., b, ..., d, e] with the permutation given by the new array? That can be accomplished in O(n) time.

like image 152
Edward Doolittle Avatar answered Nov 15 '22 11:11

Edward Doolittle