
In Python: difference between two lists

I have two lists like so:

found = ['CG', 'E6', 'E1', 'E2', 'E4', 'L2', 'E7', 'E5', 'L1', 'E2BS', 'E2BS', 'E2BS', 'E2', 'E1^E4', 'E5']
expected = ['E1', 'E2', 'E4', 'E1^E4', 'E6', 'E7', 'L1', 'L2', 'CG', 'E2BS', 'E3']

I want to find the differences between the two lists.
I have tried

list(set(expected)-set(found))

and

list(set(found)-set(expected))

Which returns ['E3'] and ['E5'] respectively.

However, the answers I need are:

'E3' is missing from found.
'E5' is missing from expected.
There are 2 copies of 'E5' in found.
There are 3 copies of 'E2BS' in found.
There are 2 copies of 'E2' in found.

Any help/suggestions are welcome!

Stylize asked Apr 21 '13 02:04



3 Answers

The collections.Counter class will excel at enumerating the differences between multisets:

>>> from collections import Counter
>>> found = Counter(['CG', 'E6', 'E1', 'E2', 'E4', 'L2', 'E7', 'E5', 'L1', 'E2BS', 'E2BS', 'E2BS', 'E2', 'E1^E4', 'E5'])
>>> expected = Counter(['E1', 'E2', 'E4', 'E1^E4', 'E6', 'E7', 'L1', 'L2', 'CG', 'E2BS', 'E3'])
>>> list((found - expected).elements())
['E2', 'E2BS', 'E2BS', 'E5', 'E5']
>>> list((expected - found).elements())
['E3']
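As a rough sketch, the Counter subtraction above could be wrapped into a small helper that returns per-item counts in both directions. The name `multiset_diff` is made up for illustration and is not part of `collections`:

```python
from collections import Counter

def multiset_diff(found, expected):
    """Return (surplus, missing) as plain dicts of item -> count.

    Hypothetical helper: `surplus` holds the copies in `found` beyond
    what `expected` has; `missing` holds the reverse.
    """
    found_counts = Counter(found)
    expected_counts = Counter(expected)
    # Counter subtraction keeps only items whose resulting count is positive.
    surplus = dict(found_counts - expected_counts)
    missing = dict(expected_counts - found_counts)
    return surplus, missing

found = ['CG', 'E6', 'E1', 'E2', 'E4', 'L2', 'E7', 'E5', 'L1',
         'E2BS', 'E2BS', 'E2BS', 'E2', 'E1^E4', 'E5']
expected = ['E1', 'E2', 'E4', 'E1^E4', 'E6', 'E7', 'L1', 'L2', 'CG', 'E2BS', 'E3']

surplus, missing = multiset_diff(found, expected)
print(surplus)  # extra copies in found, per item
print(missing)  # copies that found lacks, per item
```

Note that the subtraction reports copies beyond what the other list has, so 'E2BS' shows up with 2 surplus copies here, not the 3 total copies in found.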

You might also be interested in difflib.Differ:

>>> from difflib import Differ
>>> found = ['CG', 'E6', 'E1', 'E2', 'E4', 'L2', 'E7', 'E5', 'L1', 'E2BS', 'E2BS', 'E2BS', 'E2', 'E1^E4', 'E5']
>>> expected = ['E1', 'E2', 'E4', 'E1^E4', 'E6', 'E7', 'L1', 'L2', 'CG', 'E2BS', 'E3']
>>> for d in Differ().compare(expected, found):
...     print(d)

+ CG
+ E6
  E1
  E2
  E4
+ L2
+ E7
+ E5
+ L1
+ E2BS
+ E2BS
+ E2BS
+ E2
  E1^E4
+ E5
- E6
- E7
- L1
- L2
- CG
- E2BS
- E3
Raymond Hettinger answered Oct 06 '22 00:10


Leverage the Python set class and Counter class instead of rolling your own solution:

  1. symmetric_difference: finds elements that are either in one set or the other, but not both.
  2. intersection: finds elements in common with the two sets.
  3. difference: finds elements that are in one set but not the other, which is essentially what you did by subtracting one set from another.

Code examples

  • set(found).difference(expected) # {'E5'}
    
  • set(expected).difference(found) # {'E3'}
    
  • set(found).symmetric_difference(expected) # {'E5', 'E3'}
    
  • Finding copies of objects: a separate question about finding duplicates in a list already covers this. That technique gets you all the duplicates, and the resulting Counter object tells you how many copies of each there are. For example:

    collections.Counter(found)['E5'] # 2
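A minimal sketch of the three set operations listed above, assuming `found` and `expected` are the plain lists from the question, so they are converted to sets first. The names `found_set`, `expected_set` and the result variables are illustrative only:

```python
found = ['CG', 'E6', 'E1', 'E2', 'E4', 'L2', 'E7', 'E5', 'L1',
         'E2BS', 'E2BS', 'E2BS', 'E2', 'E1^E4', 'E5']
expected = ['E1', 'E2', 'E4', 'E1^E4', 'E6', 'E7', 'L1', 'L2', 'CG', 'E2BS', 'E3']

# Converting to sets drops duplicate entries.
found_set = set(found)
expected_set = set(expected)

only_in_found = found_set.difference(expected_set)        # same as found_set - expected_set
only_in_expected = expected_set.difference(found_set)     # same as expected_set - found_set
in_one_not_both = found_set.symmetric_difference(expected_set)
in_both = found_set.intersection(expected_set)

# sorted() is used only to get a stable print order; sets are unordered.
print(sorted(only_in_found))      # ['E5']
print(sorted(only_in_expected))   # ['E3']
print(sorted(in_one_not_both))    # ['E3', 'E5']
print(len(in_both))               # 10
```

Because sets discard duplicates, these operations cannot say how many copies of an item a list holds; that part still needs a Counter.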
    
Colonel Panic answered Oct 06 '22 01:10


You've already answered the first two:

print('{0} missing from found'.format(list(set(expected) - set(found))))
print('{0} missing from expected'.format(list(set(found) - set(expected))))

The remaining three require counting duplicates in a list, for which there are many solutions to be found online (including this one: Find and list duplicates in a list?).
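One possible way to do that counting, sketched with collections.Counter. The function name `report_duplicates` is hypothetical, and the message wording simply mirrors the question:

```python
from collections import Counter

def report_duplicates(items, name):
    """Build one message for each item that occurs more than once in `items`."""
    counts = Counter(items)
    return [
        "There are {0} copies of {1!r} in {2}.".format(count, item, name)
        for item, count in counts.items()
        if count > 1  # skip items that appear only once
    ]

found = ['CG', 'E6', 'E1', 'E2', 'E4', 'L2', 'E7', 'E5', 'L1',
         'E2BS', 'E2BS', 'E2BS', 'E2', 'E1^E4', 'E5']

for line in report_duplicates(found, 'found'):
    print(line)
```

For the list in the question this yields one message each for 'E2', 'E5' and 'E2BS'.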

Mel Boyce answered Oct 06 '22 01:10