Let me start with some background.
Let's say I have this list:
interactions = [ ['O1', 'O3'],
                 ['O2', 'O5'],
                 ['O8', 'O10'],
                 ['P3', 'P5'],
                 ['P2', 'P19'],
                 ['P1', 'P6'] ]
Each entry in the list (e.g. ['O1', 'O3']) is an interaction between two entities (although everything we're dealing with here is a string). There are many different entities in the list.
We also have the following list:
similar = [ ['O1', 'P23'],
            ['O3', 'P50'],
            ['P2', 'O40'],
            ['P19', 'O22'] ]
Here, each entry is a similarity relationship between two different entities.
So O1 is similar to P23, O3 is similar to P50, and [O1, O3] interact; thus ['P23', 'P50'] is a transformed interaction.
Likewise, P2 is similar to O40, P19 is similar to O22, and [P2, P19] interact; thus ['O40', 'O22'] is a transformed interaction.
The transformed interactions will always be between entities of the same type, e.g. [PX, PX] or [OX, OX].
So I wrote the following code to generate these relationship transfers:
from collections import defaultdict

interactions = [ ['O1', 'O3'],
                 ['O2', 'O5'],
                 ['O8', 'O10'],
                 ['P3', 'P5'],
                 ['P2', 'P19'],
                 ['P1', 'P6'] ]

similar = [ ['O1', 'H33'],
            ['O6', 'O9'],
            ['O4', 'H1'],
            ['O2', 'H12'] ]

def list_of_lists_to_dict(list_of_lists):
    # Map each entity to every entity it is paired with, in both directions.
    d = defaultdict(list)
    for sublist in list_of_lists:
        d[sublist[0]].append(sublist[1])
        d[sublist[1]].append(sublist[0])
    return d

interactions_dict = list_of_lists_to_dict(interactions)
similar_dict = list_of_lists_to_dict(similar)

forward = reverse = False  # initialise the flags before they are first read
for key, values in interactions_dict.items():
    print("{0} interacts with: {1}".format(key, ', '.join(values)))
    if key in similar_dict:
        print("  {0} is similar to: {1}".format(key, ', '.join(similar_dict[key])))
        forward = True
    for value in values:
        if value in similar_dict:
            print("   {0} is similar to: {1}".format(value, ', '.join(similar_dict[value])))
            reverse = True
        if forward and reverse:
            print("      thus [{0}, {1}] interact!".format(', '.join(similar_dict[key]),
                                                          ', '.join(similar_dict[value])))
    forward = reverse = False
My attempt does generate the correct output, but it also generates unwanted output. For example, it sometimes generates interactions between different types of entities (O1, P1) and between an entity and itself (O1, O1). It also outputs duplicate results in different orders, e.g. (O1, P1) and (P1, O1); both mean the same thing, so we only want this entry once. All of this is unwanted behaviour.
So my question is, how can I restructure my attempt to solve this problem?
Thanks.
If the similarity relationship is neither symmetric nor transitive:
from collections import defaultdict
from itertools import product

# `interactions` and `similar` are the lists defined in the question

# entity -> similar entities
d = defaultdict(list)  # use `set` if `similar` has duplicate entries
for k, v in similar:
    d[k].append(v)

for a, b in interactions:
    for x, y in product(d[a], d[b]):
        # a, b interact; a is similar to x, b is similar to y
        # note: filter undesired x, y interactions here
        print(x, y)  # transformed interaction
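The "filter undesired x, y interactions here" step could be filled in along the following lines. This is only a sketch: the names `entity_type` and `transformed_interactions` are made up for illustration, and it assumes that an entity's type is encoded by the first letter of its name (so 'O1' has type 'O'), which the question's examples suggest but do not state. It uses the first pair of lists from the question.

```python
from collections import defaultdict
from itertools import product

interactions = [['O1', 'O3'], ['O2', 'O5'], ['O8', 'O10'],
                ['P3', 'P5'], ['P2', 'P19'], ['P1', 'P6']]
similar = [['O1', 'P23'], ['O3', 'P50'], ['P2', 'O40'], ['P19', 'O22']]


def entity_type(name):
    # Assumption: the leading letter encodes the type ('O1' -> 'O').
    return name[0]


def transformed_interactions(interactions, similar):
    # entity -> set of similar entities (one direction only, as above)
    d = defaultdict(set)
    for k, v in similar:
        d[k].add(v)

    result = set()
    for a, b in interactions:
        # .get avoids inserting empty entries into the defaultdict
        for x, y in product(d.get(a, ()), d.get(b, ())):
            if x == y:
                continue  # skip self-interactions such as (O1, O1)
            if entity_type(x) != entity_type(y):
                continue  # skip mixed-type pairs such as (O1, P1)
            # sorting the pair makes (x, y) and (y, x) the same entry
            result.add(tuple(sorted((x, y))))
    return sorted(result)


print(transformed_interactions(interactions, similar))
# [('O22', 'O40'), ('P23', 'P50')]
```

Collecting the pairs in a set of sorted tuples is what removes the mirrored duplicates; if the original order of each pair matters, a `frozenset` key with a separate ordered list would be one alternative.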
I have some recommendations for the overall algorithm:
Some of these problems are addressed by J.F. Sebastian's answer, but I think you should pay attention to how the original dictionary is constructed; that will make it much easier to come up with results that make sense.
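As one possible reading of the point about dictionary construction, the sketch below builds the lookup with sets and makes the choice between a one-directional and a symmetric mapping explicit. The helper name `build_index` and its `symmetric` flag are hypothetical, and the sample list deliberately contains a repeated pair to show that sets absorb it.

```python
from collections import defaultdict


def build_index(pairs, symmetric=True):
    # entity -> set of related entities; sets drop repeated pairs
    index = defaultdict(set)
    for a, b in pairs:
        index[a].add(b)
        if symmetric:
            index[b].add(a)  # also record the reverse direction
    return index


similar = [['O1', 'P23'], ['O3', 'P50'], ['O1', 'P23']]  # repeated pair on purpose

directed = build_index(similar, symmetric=False)
undirected = build_index(similar)

print(sorted(directed['O1']))     # ['P23']
print('P23' in directed)          # False: only the left-hand entities are keys
print(sorted(undirected['P23']))  # ['O1']
```

Whether the symmetric form is appropriate depends on whether "A is similar to B" is meant to imply "B is similar to A" in your data; the question's own helper records both directions, while the answer above records only one.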