 

Best way to find objects not present in both lists

I am working on a module that needs to find the objects that are present in one of two lists but not in the other. The implementation has to be in Python.

Consider this simplified class definition:

class Foo(object):

  def __init__(self, attr_one=None, attr_two=None):
    self.attr_one = attr_one
    self.attr_two = attr_two

  def __eq__(self, other):
    return self.attr_one == other.attr_one and self.attr_two == other.attr_two

I have two separate lists, each of which can contain multiple instances of class Foo:

list1 = [Foo('abc', 2), Foo('bcd', 3), Foo('cde', 4)]
list2 = [Foo('abc', 2), Foo('bcd', 4), Foo('efg', 5)]

I need to find the objects that are present in one list and absent from the other on the basis of attr_one. In this case, the desired output for items present in the first list and missing from the second is:

 [Foo('bcd', 3), Foo('cde', 4)]

Similarly, the items present in list 2 but not in list 1 are:

 [Foo('bcd', 4), Foo('efg', 5)]

I would also like to know if there is a way to pair up the mismatched objects on the basis of attr_one, like this:

  List 1                 List 2        
  Foo('bcd', 3)          Foo('bcd', 4)
  Foo('cde', 4)          None
  None                   Foo('efg', 5)
Kartik asked Dec 11 '22 17:12


2 Answers

Since you already have an __eq__ method defined, you can use a list comprehension to find the objects in one list that have no equal in the other:

print([obj for obj in list1 if obj not in list2])
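
A minimal, self-contained sketch of this approach in both directions, assuming Python 3 and the Foo class from the question. The __repr__ method is an addition for readable output and is not part of the original class. Each `not in` test scans the other list, so the cost grows roughly with len(list1) * len(list2).

```python
class Foo(object):

    def __init__(self, attr_one=None, attr_two=None):
        self.attr_one = attr_one
        self.attr_two = attr_two

    def __eq__(self, other):
        return self.attr_one == other.attr_one and self.attr_two == other.attr_two

    def __repr__(self):
        # Added for illustration only, so printed results are readable.
        return "Foo({!r}, {!r})".format(self.attr_one, self.attr_two)


list1 = [Foo('abc', 2), Foo('bcd', 3), Foo('cde', 4)]
list2 = [Foo('abc', 2), Foo('bcd', 4), Foo('efg', 5)]

# "not in" relies on __eq__, so both attributes must match for two objects to count as equal.
only_in_1 = [obj for obj in list1 if obj not in list2]
only_in_2 = [obj for obj in list2 if obj not in list1]

print(only_in_1)  # [Foo('bcd', 3), Foo('cde', 4)]
print(only_in_2)  # [Foo('bcd', 4), Foo('efg', 5)]
```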
GTM answered Dec 26 '22 16:12


A quick way to find which elements are present in one list but not the other is to create sets from the original lists and take the difference between the two sets. For a list to be converted into a set, the objects it contains must be hashable, so you must define a __hash__() method on your Foo class:

def __hash__(self):
    return hash((self.attr_one, self.attr_two))

Note that a tuple is hashable as long as its elements are, so this implementation works whenever attr_one and attr_two are hashable types. It is also consistent with __eq__: two objects that compare equal produce the same hash.
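
As a small illustration of why the method is needed, here is a sketch assuming Python 3, where a class that defines __eq__ without __hash__ becomes unhashable. FooNoHash is a hypothetical name used only for this comparison.

```python
class FooNoHash(object):
    # Hypothetical class for illustration: the question's Foo, which has __eq__ but no __hash__.
    def __init__(self, attr_one=None, attr_two=None):
        self.attr_one = attr_one
        self.attr_two = attr_two

    def __eq__(self, other):
        return self.attr_one == other.attr_one and self.attr_two == other.attr_two


class Foo(FooNoHash):
    # Redefining __eq__ here as well, because Python 3 sets __hash__ to None
    # only in the class body that defines __eq__ without __hash__.
    def __eq__(self, other):
        return self.attr_one == other.attr_one and self.attr_two == other.attr_two

    def __hash__(self):
        return hash((self.attr_one, self.attr_two))


try:
    set([FooNoHash('abc', 2)])
    hashable = True
except TypeError:
    # Python 3 raises "unhashable type" for the class without __hash__.
    hashable = False
print(hashable)  # False on Python 3

a, b = Foo('abc', 2), Foo('abc', 2)
print(a == b, hash(a) == hash(b))  # True True
print(len({a, b}))  # 1, because equal objects collapse into one set element
```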

Now, to determine which elements are present in one list but not the other:

set1 = set(list1)
set2 = set(list2)
missing_from_1 = set2 - set1
missing_from_2 = set1 - set2
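
Put together, the set-based approach might look like the following sketch, assuming Python 3. The __repr__ method and the sorting step are additions for readable, stable output, since sets have no defined order.

```python
class Foo(object):

    def __init__(self, attr_one=None, attr_two=None):
        self.attr_one = attr_one
        self.attr_two = attr_two

    def __eq__(self, other):
        return self.attr_one == other.attr_one and self.attr_two == other.attr_two

    def __hash__(self):
        # Must agree with __eq__: equal objects hash equally.
        return hash((self.attr_one, self.attr_two))

    def __repr__(self):
        # Added for illustration only.
        return "Foo({!r}, {!r})".format(self.attr_one, self.attr_two)


list1 = [Foo('abc', 2), Foo('bcd', 3), Foo('cde', 4)]
list2 = [Foo('abc', 2), Foo('bcd', 4), Foo('efg', 5)]

set1 = set(list1)
set2 = set(list2)
missing_from_1 = set2 - set1   # in list2 but not in list1
missing_from_2 = set1 - set2   # in list1 but not in list2
in_exactly_one = set1 ^ set2   # symmetric difference: both directions at once


def by_attrs(foo):
    # Sort key used only to display the unordered sets deterministically.
    return (foo.attr_one, foo.attr_two)


print(sorted(missing_from_1, key=by_attrs))  # [Foo('bcd', 4), Foo('efg', 5)]
print(sorted(missing_from_2, key=by_attrs))  # [Foo('bcd', 3), Foo('cde', 4)]
```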

To do this on the basis of only one of the attributes, you can build your sets from the attribute values themselves and take the differences as before:

set1 = {i.attr_one for i in list1}
set2 = {i.attr_one for i in list2}

Of course, the resulting differences only tell you which attr_one values are present in one list but not the other, rather than giving you the actual Foo objects. The objects themselves are easy to recover, however, once you have the "missing" sets:

missing_Foos = set()
for attr in missing_from_2:
    for i in list1:
        if i.attr_one == attr:
            missing_Foos.add(i)

This can be rather computationally expensive for very long lists, though, since the nested loops scan all of list1 once for every missing attribute value.
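
One way to avoid the nested loops, and to produce the side-by-side pairing shown in the question, is to index each list by attr_one in a dict. This is only a sketch, assuming Python 3 and that attr_one is unique within each list (a later duplicate would silently overwrite an earlier one). The names index1, index2 and rows are made up for this example, and __repr__ is added for readable output.

```python
class Foo(object):

    def __init__(self, attr_one=None, attr_two=None):
        self.attr_one = attr_one
        self.attr_two = attr_two

    def __eq__(self, other):
        return self.attr_one == other.attr_one and self.attr_two == other.attr_two

    def __repr__(self):
        # Added for illustration only.
        return "Foo({!r}, {!r})".format(self.attr_one, self.attr_two)


list1 = [Foo('abc', 2), Foo('bcd', 3), Foo('cde', 4)]
list2 = [Foo('abc', 2), Foo('bcd', 4), Foo('efg', 5)]

# Assumption: attr_one is unique within each list.
index1 = {foo.attr_one: foo for foo in list1}
index2 = {foo.attr_one: foo for foo in list2}

rows = []
for key in sorted(set(index1) | set(index2)):
    left = index1.get(key)    # None if attr_one is absent from list1
    right = index2.get(key)   # None if attr_one is absent from list2
    # Keep the pair unless both sides exist and are fully equal.
    if left is None or right is None or not (left == right):
        rows.append((left, right))

for left, right in rows:
    print(left, right)
# Foo('bcd', 3) Foo('bcd', 4)
# Foo('cde', 4) None
# None Foo('efg', 5)
```

Each dict lookup takes constant time on average, so the whole pairing is roughly linear in the combined length of the lists, plus the cost of sorting the keys.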

EDIT: using sets is only really worthwhile if you have very large lists and therefore need the computational efficiency of set operations. Otherwise, it may be simpler to use list comprehensions, as suggested in the other answer.

Kyle Strand answered Dec 26 '22 16:12