Edit The accepted answer works for sets that satisfy the requirements of a strict partially ordered set, so that a directed acyclic graph can be constructed:
not a < a
: the list does not contain items like ['a','a']
if a < b and b < c then a < c
: the list does not contain items like ['a','b'],['b','c'],['c','a']
if a < b then not b < a
: the list does not contain items like ['a','b'],['b','a']
Take this list of lists:[['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd'], ]
and flatten it to a single list, sorted depending on a value's neighbor(s):
The overall ordering between sublists is consistent, meaning there won't be sublists like these: ['b','c'],['c','b']
.
So the result should be: ['b', 'a', 'c', 'd']
After a (long) while I came up with this ugly mess:
def get_order(_list):
order = _list[0]
for sublist in _list[1:]:
if not sublist:
continue
if len(sublist) == 1:
if sublist[0] not in order:
order.append(sublist[0])
continue
new_order = order.copy()
for index, value in enumerate(sublist):
inserted = False
new_order_index = None
if value in new_order:
new_order_index = new_order.index(value)
new_order.remove(value)
for previous_value in sublist[:index][::-1]:
if previous_value in new_order:
insert_index = new_order.index(previous_value) + 1
print('inserting', value, 'at position', insert_index, 'after', previous_value)
new_order.insert(insert_index, value)
inserted = True
break
if inserted:
continue
for next_value in sublist[index:]:
if next_value in new_order:
insert_index = new_order.index(next_value)
print('inserting', value, 'at position', insert_index, 'before', next_value)
new_order.insert(insert_index, value)
inserted = True
break
if inserted:
continue
if new_order_index is None:
print('appending', value)
new_order.append(value)
else:
print('leaving', value, 'at position', new_order_index)
new_order.insert(new_order_index, value)
order = new_order
return order
if __name__ == '__main__':
test_list = [['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd'], ]
order = get_order(test_list)
#>>> inserting a at position 1 before c
#>>> inserting c at position 2 after a
#>>> inserting d at position 3 after c
#>>> inserting a at position 1 before c
#>>> inserting c at position 2 after a
#>>> inserting b at position 0 before a
#>>> inserting a at position 1 after b
print(order)
#>>> ['b', 'a', 'c', 'd']
It appears to do exactly as expected, but it is far from efficient (or elegant for that matter).
Are there any algorithms that can sort like that?
Or are there some pythonic tricks that would make this more efficient?
Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such that for every directed edge u v, vertex u comes before v in the ordering.
In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering.
A Total Ordering of a Poset The topological sort is a solution to scheduling problems, and it is built on the two concepts previously discussed: partial ordering and total ordering. At its core, a topological sort, or a linear extension, is a total ordering of a partially ordered set.
Topological sort simply involves running DFS on an entire graph and adding each node to the global ordering of nodes, but only after all of a node's children are visited. This ensures that parent nodes will be ordered before their child nodes, and honors the forward direction of edges in the ordering.
You can create a lookup function that determines if a particular value should be placed before or after another value:
d = [['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']]
flattened = {i for b in d for i in b}
def _lookup(a, b):
_loc = [i for i in d if a in i and b in i]
return True if not _loc else _loc[0].index(a) < _loc[0].index(b)
class T:
def __init__(self, _val):
self.v = _val
def __lt__(self, _n):
return _lookup(self.v, _n.v)
final_result = [i.v for i in sorted(map(T, flattened))]
Output:
['b', 'a', 'c', 'd']
Using [['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd'], ['a', 'e']]
:
['b', 'a', 'c', 'e', 'd']
The existing answers by nosklo and Ajax1234 both fail on an input of [[1, 3], [3, 5], [5, 2], [2, 4]]
. The attempt in your question fails on an input of [[1, 4], [2, 3], [3, 4], [1, 2]]
.
The correct approach is as described by BowlingHawk95: perform a topological sort on the directed acyclic graph induced by your input list.
We could implement our own topological sort, but it's safer to let an existing graph library handle it. For example, NetworkX:
from itertools import chain, tee
import networkx
import networkx.algorithms
# pairwise recipe from the itertools docs.
def pairwise(iterable):
"s -> (s0,s1), (s1,s2), (s2, s3), ..."
a, b = tee(iterable)
next(b, None)
return zip(a, b)
def merge_ordering(sublists):
# Make an iterator of graph edges for the new graph. Some edges may be repeated.
# That's fine. NetworkX will ignore duplicates.
edges = chain.from_iterable(map(pairwise, sublists))
graph = networkx.DiGraph(edges)
return list(networkx.algorithms.topological_sort(graph))
This produces correct output for the input in the question, the [[1, 3], [3, 5], [5, 2], [2, 4]]
case the other answers failed on, and the [[1, 4], [2, 3], [3, 4], [1, 2]]
case your attempt failed on:
>>> merge_ordering([[1, 3], [3, 5], [5, 2], [2, 4]])
[1, 3, 5, 2, 4]
>>> merge_ordering([['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']])
['b', 'a', 'c', 'd']
>>> merge_ordering([[1, 4], [2, 3], [3, 4], [1, 2]])
[1, 2, 3, 4]
We can also write a version that raises an error if the input list doesn't uniquely determine the flattened form:
def merge_ordering_unique(sublists):
# Make an iterator of graph edges for the new graph. Some edges may be repeated.
# That's fine. NetworkX will ignore duplicates.
edges = chain.from_iterable(map(pairwise, sublists))
graph = networkx.DiGraph(edges)
merged = list(networkx.algorithms.topological_sort(graph))
for a, b in pairwise(merged):
if not graph.has_edge(a, b):
raise ValueError('Input has multiple possible topological orderings.')
return merged
Demo:
>>> merge_ordering_unique([['b', 'c'], ['a', 'c'], ['b', 'a'], ['a', 'c', 'd']])
['b', 'a', 'c', 'd']
>>> merge_ordering_unique([[1, 3, 4], [1, 2, 4]])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 11, in merge_ordering_unique
ValueError: Input has multiple possible topological orderings.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With