Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merging tuples if they have one common element

Tags:

python

Consider the following list:

tuple_list = [('c', 'e'), ('c', 'd'), ('a', 'b'), ('d', 'e')]

How can I achieve this?

new_tuple_list = [('c', 'e', 'd'), ('a', 'b')]

I have tried:

for tuple in tuple_list:
    for tup in tuple_list:
        if tuple[0] == tup[0]:
            new_tup = (tuple[0],tuple[1],tup[1])
            new_tuple_list.append(new_tup)

But it only works if I have the elements of the tuple in a certain order which means it will result in this instead:

new_tuple_list = [('c', 'e', 'd'), ('a', 'b'), ('d', 'e')]
like image 988
Meryem Avatar asked Dec 03 '22 13:12

Meryem


2 Answers

You could consider the tuples as edges in a graph and your goal as finding connected components within the graph. Then you could simply loop over vertices (items in tuples) and for each vertex you haven't visited yet execute DFS to generate a component:

from collections import defaultdict

def dfs(adj_list, visited, vertex, result, key):
    visited.add(vertex)
    result[key].append(vertex)
    for neighbor in adj_list[vertex]:
        if neighbor not in visited:
            dfs(adj_list, visited, neighbor, result, key)

edges = [('c', 'e'), ('c', 'd'), ('a', 'b'), ('d', 'e')]

adj_list = defaultdict(list)
for x, y in edges:
    adj_list[x].append(y)
    adj_list[y].append(x)

result = defaultdict(list)
visited = set()
for vertex in adj_list:
    if vertex not in visited:
        dfs(adj_list, visited, vertex, result, vertex)

print(result.values())

Output:

[['a', 'b'], ['c', 'e', 'd']]

Note that in above both the components and elements within a component are in random order.

like image 76
niemmi Avatar answered Jan 05 '23 06:01

niemmi


If you don't need duplicate values (the ability to preserve ['a', 'a', 'b'], for example), this is a simple and fast way to do what you want via sets:

iset = set([frozenset(s) for s in tuple_list])  # Convert to a set of sets
result = []
while(iset):                  # While there are sets left to process:
    nset = set(iset.pop())      # Pop a new set
    check = len(iset)           # Does iset contain more sets
    while check:                # Until no more sets to check:
        check = False
        for s in iset.copy():       # For each other set:
            if nset.intersection(s):  # if they intersect:
                check = True            # Must recheck previous sets
                iset.remove(s)          # Remove it from remaining sets
                nset.update(s)          # Add it to the current set
    result.append(tuple(nset))  # Convert back to a list of tuples

gives

[('c', 'e', 'd'), ('a', 'b')]
like image 38
TemporalWolf Avatar answered Jan 05 '23 05:01

TemporalWolf