I need to compare two lists, each of which is a list of lists, and find the sublists that are present in one list but not in the other. The order of elements within a sublist does not matter, i.e. ['a','b'] = ['b','a']. The two lists are
List_1 = [['T_1','T_2'],['T_2','T_3'],['T_1','T_3']]
List_2 = [['T_1','T_2'],['T_3','T_1']]
The output list should be
out_list = [['T_2','T_3']]
For two-element sublists, this should suffice:
[x for x in List_1 if x not in List_2 and x[::-1] not in List_2]
Code:
List_1 = [['T_1','T_2'],['T_2','T_3'],['T_1','T_3']]
List_2 = [['T_1','T_2'],['T_3','T_1']]
print([x for x in List_1 if x not in List_2 and x[::-1] not in List_2])
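The reversed-slice check above only covers sublists of length two. As a rough sketch for sublists of any length, each sublist can be sorted before comparing. The function name `missing_from` and the sample data are made up for illustration, and the sketch assumes the elements can be ordered against each other:

```python
def missing_from(first, second):
    """Return the sublists of `first` that have no order-insensitive match in `second`."""
    # Sort every sublist of `second` once so the comparison ignores element order.
    normalised = [sorted(sub) for sub in second]
    return [sub for sub in first if sorted(sub) not in normalised]


# Hypothetical data with a three-element sublist.
first_list = [['T_1', 'T_2', 'T_3'], ['T_2', 'T_3'], ['T_1', 'T_3']]
second_list = [['T_3', 'T_1', 'T_2'], ['T_3', 'T_1']]

result = missing_from(first_list, second_list)
print(result)  # [['T_2', 'T_3']]
```

Like the comprehension above, this only reports sublists of the first list that are missing from the second, and it keeps them in their original order.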
Here's a slightly messy functional solution that uses sets and tuples in the process (sets are used because what you're trying to calculate is the symmetric difference, and tuples are used because, unlike lists, they're hashable and can be used as set elements):
List_1 = [['T_1','T_2'],['T_2','T_3'],['T_1','T_3']]
List_2 = [['T_1','T_2'],['T_3','T_1']]
f = lambda l : tuple(sorted(l))
out_list = list(map(list, set(map(f, List_1)).symmetric_difference(map(f, List_2))))
print(out_list)
Output:
[['T_2', 'T_3']]
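With the lists from the question, the symmetric difference and a one-sided difference happen to give the same result, because every sublist of List_2 also occurs in List_1. A small sketch with made-up data (the names `first`, `second` and `key` are illustrative only) shows where the two diverge:

```python
def key(sub):
    # A sorted tuple is a hashable, order-insensitive stand-in for a sublist.
    return tuple(sorted(sub))


# Hypothetical data: `second` holds a sublist that `first` lacks.
first = [['T_1', 'T_2'], ['T_2', 'T_3']]
second = [['T_2', 'T_1'], ['T_4', 'T_5']]

keys_first = set(map(key, first))
keys_second = set(map(key, second))

one_sided = keys_first - keys_second   # only in `first`
both_sides = keys_first ^ keys_second  # in exactly one of the two

print(sorted(one_sided))   # [('T_2', 'T_3')]
print(sorted(both_sides))  # [('T_2', 'T_3'), ('T_4', 'T_5')]
```

If only the sublists missing from the second list are wanted, the plain set difference is the one to use.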
I'd say frozensets are more appropriate for such a task:
fs2 = set(map(frozenset,List_2))
out = set(map(frozenset,List_1)).symmetric_difference(fs2)
print(out)
# {frozenset({'T_2', 'T_3'})}
The advantage of using frozensets here is that they can be hashed, hence you can simply map both lists and take the set.symmetric_difference.
If you want a nested list from the output, you can simply do:
list(map(list, out))
Note that some sublists, and the elements within them, might appear in a different order, though given the task this should not be a problem.
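One caveat worth keeping in mind with frozensets: repeated elements inside a sublist collapse, so ['T_1', 'T_1', 'T_2'] and ['T_2', 'T_1'] compare as equal. If that matters for your data, a sorted tuple or a collections.Counter keeps the counts. A quick sketch, with variable names chosen only for illustration:

```python
from collections import Counter

# Hypothetical sublists that differ only in how often 'T_1' occurs.
a = ['T_1', 'T_1', 'T_2']
b = ['T_2', 'T_1']

same_as_sets = frozenset(a) == frozenset(b)
same_as_counters = Counter(a) == Counter(b)
same_as_sorted = tuple(sorted(a)) == tuple(sorted(b))

print(same_as_sets)      # True: the repeated 'T_1' is lost
print(same_as_counters)  # False: the counts differ
print(same_as_sorted)    # False: the lengths differ
```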
You can convert the lists to sets for equality comparison and use any() to keep only the items that don't exist in the second list:
List_1 = [['T_1', 'T_2'], ['T_2', 'T_3'], ['T_1', 'T_3']]
List_2 = [['T_1', 'T_2'], ['T_3', 'T_1']]
out_list = [l1 for l1 in List_1 if not any(set(l1) == set(l2) for l2 in List_2)]
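The any() version rebuilds a set for every pair of sublists, so the work grows with len(List_1) * len(List_2). A possible variant, sketched under the assumption that all elements are hashable, builds the lookup once and keeps the order of List_1. The function name `difference_with_lookup` is made up for this example:

```python
def difference_with_lookup(first, second):
    """Return sublists of `first` whose elements, as a set, do not occur in `second`."""
    # Build the lookup once; membership tests on a set are constant time on average.
    lookup = {frozenset(sub) for sub in second}
    return [sub for sub in first if frozenset(sub) not in lookup]


List_1 = [['T_1', 'T_2'], ['T_2', 'T_3'], ['T_1', 'T_3']]
List_2 = [['T_1', 'T_2'], ['T_3', 'T_1']]

out_list = difference_with_lookup(List_1, List_2)
print(out_list)  # [['T_2', 'T_3']]
```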
To better understand the resource consumption and efficiency of each answer, I've run some tests. I hope they help in choosing the best one.
Results on data from question:
Results on bigger data:
If you do not have duplicates in your lists you can use:
output = set(frozenset(e) for e in List_1).symmetric_difference({frozenset(e) for e in List_2})
Output:
{frozenset({'T_2', 'T_3'})}
If you need a list of lists as output you can use:
[list(o) for o in output]
Output:
[['T_2', 'T_3']]