Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merge lists that share common elements

My input is a list of lists. Some of them share common elements, eg.

L = [['a','b','c'],['b','d','e'],['k'],['o','p'],['e','f'],['p','a'],['d','g']] 

I need to merge all lists, that share a common element, and repeat this procedure as long as there are no more lists with the same item. I thought about using boolean operations and a while loop, but couldn't come up with a good solution.

The final result should be:

L = [['a','b','c','d','e','f','g','o','p'],['k']]  
like image 651
WlJs Avatar asked Jan 30 '11 11:01

WlJs


People also ask

How do you merge elements in a list python?

The most conventional method to concatenate lists in python is by using the concatenation operator(+). The “+” operator can easily join the whole list behind another list and provide you with the new list as the final output as shown in the below example.

How do I concatenate a list of lists?

📢 TLDR: Use + In almost all simple situations, using list1 + list2 is the way you want to concatenate lists. The edge cases below are better in some situations, but + is generally the best choice.


1 Answers

You can see your list as a notation for a Graph, ie ['a','b','c'] is a graph with 3 nodes connected to each other. The problem you are trying to solve is finding connected components in this graph.

You can use NetworkX for this, which has the advantage that it's pretty much guaranteed to be correct:

l = [['a','b','c'],['b','d','e'],['k'],['o','p'],['e','f'],['p','a'],['d','g']]  import networkx  from networkx.algorithms.components.connected import connected_components   def to_graph(l):     G = networkx.Graph()     for part in l:         # each sublist is a bunch of nodes         G.add_nodes_from(part)         # it also imlies a number of edges:         G.add_edges_from(to_edges(part))     return G  def to_edges(l):     """          treat `l` as a Graph and returns it's edges          to_edges(['a','b','c','d']) -> [(a,b), (b,c),(c,d)]     """     it = iter(l)     last = next(it)      for current in it:         yield last, current         last = current      G = to_graph(l) print connected_components(G) # prints [['a', 'c', 'b', 'e', 'd', 'g', 'f', 'o', 'p'], ['k']] 

To solve this efficiently yourself you have to convert the list into something graph-ish anyways, so you might as well use networkX from the start.

like image 144
Jochen Ritzel Avatar answered Oct 23 '22 01:10

Jochen Ritzel