
How to compare two lists of dicts in Python?

Tags:

python

How do I compare two lists of dicts? The result should contain the entries from list B that differ from list A.

Example:

ldA = [{'user':"nameA", 'a':7.6, 'b':100.0, 'c':45.5, 'd':48.9},
       {'user':"nameB", 'a':46.7, 'b':67.3, 'c':0.0, 'd':5.5}]


ldB =[{'user':"nameA", 'a':7.6, 'b':99.9, 'c':45.5, 'd':43.7},
      {'user':"nameB", 'a':67.7, 'b':67.3, 'c':1.1, 'd':5.5},
      {'user':"nameC", 'a':89.9, 'b':77.3, 'c':2.2, 'd':6.5}]

Here I want to compare ldA with ldB. It should print the output below.

ldB -> {user:"nameA",  b:99.9, d:43.7}
ldB -> {user:"nameB",  a:67.7, c:1.1 }
ldB -> {user:"nameC", a:89.9, b:77.3, c:2.2, d:6.5}

I have gone through the link below, but the solution there returns only the name, whereas I want the name and the value as shown above.

List of Dicts comparision to match between lists and detect value changes in Python

Asked by newbe, Jun 13 '11 16:06


4 Answers

For a general solution, consider the following. It will properly diff, even if the users are out of order in the lists.

def dict_diff ( merge, lhs, rhs ):
    """Generic dictionary difference."""
    diff = {}
    for key in lhs.keys():
        # auto-merge for missing key on right-hand-side.
        if key not in rhs:
            diff[key] = lhs[key]
        # on collision, invoke custom merge function.
        elif lhs[key] != rhs[key]:
            diff[key] = merge(lhs[key], rhs[key])
    for key in rhs.keys():
        # auto-merge for missing key on left-hand-side.
        if key not in lhs:
            diff[key] = rhs[key]
    return diff

def user_diff ( lhs, rhs ):
    """Merge dictionaries using value from right-hand-side on conflict."""
    merge = lambda l,r: r
    return dict_diff(merge, lhs, rhs)

import copy

def push ( x, k, v ):
    """Returns copy of dict `x` with key `k` set to `v`."""
    x = copy.copy(x)
    x[k] = v
    return x

def pop ( x, k ):
    """Returns copy of dict `x` without key `k`."""
    x = copy.copy(x)
    del x[k]
    return x

def special_diff ( lhs, rhs, k ):
    # transform list of dicts into 2 levels of dicts, 1st level indexed by k.
    lhs = dict([(D[k],pop(D,k)) for D in lhs])
    rhs = dict([(D[k],pop(D,k)) for D in rhs])
    # diff at the 1st level.
    c = dict_diff(user_diff, lhs, rhs)
    # transform back to the initial format.
    return [push(D,k,K) for (K,D) in c.items()]

Then, you can check the solution:

ldA = [{'user':"nameA", 'a':7.6, 'b':100.0, 'c':45.5, 'd':48.9},
       {'user':"nameB", 'a':46.7, 'b':67.3, 'c':0.0, 'd':5.5}]
ldB =[{'user':"nameA", 'a':7.6, 'b':99.9, 'c':45.5, 'd':43.7},
      {'user':"nameB", 'a':67.7, 'b':67.3, 'c':1.1, 'd':5.5},
      {'user':"nameC", 'a':89.9, 'b':77.3, 'c':2.2, 'd':6.5}]
import pprint
if __name__ == '__main__':
    pprint.pprint(special_diff(ldA, ldB, 'user'))

Answered by André Caron, Oct 13 '22 03:10


My approach: build a lookup from ldA of the values to exclude, then build the result by excluding the matching values from each dict in ldB.

lookup = dict((x['user'], dict(x)) for x in ldA)
# 'dict(x)' is used here to make a copy
for v in lookup.values(): del v['user']

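result = [
    dict(
        (k, v)
        for (k, v) in item.items()
        # keep a pair if the user is new, the key is absent from the lookup
        # (this includes 'user'), or the value differs from the one in ldA
        if item['user'] not in lookup
        or k not in lookup[item['user']]
        or lookup[item['user']][k] != v
    )
    for item in ldB
]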

You should, however, be aware that comparing floating-point values for exact equality like this is unreliable: two values that ought to be equal can differ slightly because of rounding.
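
One way to soften the floating-point caveat is to compare with a tolerance instead of exact equality. This is a minimal sketch, assuming Python 3.5+ (for `math.isclose`); the helper name `differs`, the tolerance values, and the shortened sample data are illustrative assumptions, not part of the answer above.

```python
import math

def differs(old, new, rel_tol=1e-9, abs_tol=1e-12):
    """Return True if two values should be reported as different.

    Floats are compared with a tolerance; everything else with !=.
    The tolerances are arbitrary illustrative defaults.
    """
    if isinstance(old, float) and isinstance(new, float):
        return not math.isclose(old, new, rel_tol=rel_tol, abs_tol=abs_tol)
    return old != new

# Shortened sample data: 0.1 + 0.2 is not exactly 0.3 in binary floating point.
ldA = [{'user': "nameA", 'a': 0.1 + 0.2, 'b': 100.0}]
ldB = [{'user': "nameA", 'a': 0.3, 'b': 99.9}]

lookup = {d['user']: d for d in ldA}

result = []
for item in ldB:
    old = lookup.get(item['user'], {})
    # Always keep 'user'; keep other keys only if new or meaningfully changed.
    changed = {k: v for k, v in item.items()
               if k == 'user' or k not in old or differs(old[k], v)}
    result.append(changed)

print(result)  # [{'user': 'nameA', 'b': 99.9}]
```

With exact equality, `'a'` would be reported as changed here even though the two values agree to within rounding error.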

Answered by Karl Knechtel, Oct 13 '22 05:10


I am going to assume that the corresponding dicts are in the same order in both lists.

Under that assumption, you can use the following code:

def diffs(L1, L2):
    answer = []
    for i, d1 in enumerate(L1):
        d = {}
        d2 = L2[i]  # the dict at the same position in the second list
        for key in d1:
            if key not in d2:
                print(key, "is in d1 but not in d2")
            elif d1[key] != d2[key]:
                # record the value from the second list when it differs
                d[key] = d2[key]
        answer.append(d)
    return answer

Untested. Please comment if there are errors and I will fix them
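
If the second list can be longer than the first, as `ldB` is in the question, the positional approach can be extended with `itertools.zip_longest`. The following is a sketch for Python 3 under the same same-order assumption; the name `positional_diffs` and the choice to always keep the `'user'` key are assumptions made for illustration.

```python
from itertools import zip_longest

def positional_diffs(first, second, id_key='user'):
    """Diff dicts pairwise by position; unmatched dicts in `second` are kept whole."""
    answer = []
    for d1, d2 in zip_longest(first, second):
        if d2 is None:
            # `first` is longer: nothing in `second` to report for this slot.
            continue
        if d1 is None:
            # No counterpart in `first`: report the whole dict.
            answer.append(dict(d2))
            continue
        # Keep the identifying key plus every key that is new or changed.
        changed = {k: v for k, v in d2.items()
                   if k == id_key or k not in d1 or d1[k] != v}
        answer.append(changed)
    return answer

ldA = [{'user': "nameA", 'a': 7.6, 'b': 100.0, 'c': 45.5, 'd': 48.9},
       {'user': "nameB", 'a': 46.7, 'b': 67.3, 'c': 0.0, 'd': 5.5}]
ldB = [{'user': "nameA", 'a': 7.6, 'b': 99.9, 'c': 45.5, 'd': 43.7},
       {'user': "nameB", 'a': 67.7, 'b': 67.3, 'c': 1.1, 'd': 5.5},
       {'user': "nameC", 'a': 89.9, 'b': 77.3, 'c': 2.2, 'd': 6.5}]

for row in positional_diffs(ldA, ldB):
    print("ldB ->", row)
```

For the sample data this reports `b` and `d` for nameA, `a` and `c` for nameB, and the whole dict for nameC.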

Answered by inspectorG4dget, Oct 13 '22 05:10


One more solution. It is a bit unusual (sorry if I missed something), but it lets you configure your own equality check (just modify the isEqual lambda) and gives you two options for handling keys that are present in one dict but not the other:

ldA = [{'user':"nameA", 'a':7.6, 'b':100.0, 'c':45.5, 'd':48.9},
       {'user':"nameB", 'a':46.7, 'b':67.3, 'c':0.0, 'd':5.5}]


ldB =[{'user':"nameA", 'a':7.6, 'b':99.9, 'c':45.5, 'd':43.7},
      {'user':"nameB", 'a':67.7, 'b':67.3, 'c':1.1, 'd':5.5},
      {'user':"nameC", 'a':89.9, 'b':77.3, 'c':2.2, 'd':6.5}]

ldA.extend(ldB.pop() for i in range(len(ldB))) # merge everything into a single list (this empties ldB)

output = []

isEqual = lambda x,y: x != y # despite its name, returns True when the values differ; put your custom check here, e.g. round values before comparing

while len(ldA) > 0: # iterate through list
    row = ldA.pop(0) # get the first element in list and remove it from list
    for i, srow in enumerate(ldA):
        if row['user'] != srow['user']:
            continue
        res = {'user': srow['user']}
        # next line will ignore all keys of srow which are not in row
        res.update(dict((key,val) for key,val in ldA.pop(i).items() if key in row and isEqual(val, row[key])))
        # next line will include the srow.key and srow.value into the results even in a case when there is no such pair in a row
        #res.update(dict(filter(lambda d: isEqual(d[1], row[d[0]]) if d[0] in row else True ,ldA.pop(i).items())))
        output.append(res)
        break
    else:
        output.append(row)

print(output)
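
The custom-comparison idea can also be sketched without mutating the input lists. The version below is a Python 3 sketch, not the code above: `diff_by_user` and its `differs` parameter are hypothetical names, rounding to whole numbers is only an example of a custom check, and it follows the second option (keys missing from the old dict are included).

```python
def diff_by_user(old_rows, new_rows, differs=lambda x, y: x != y, key='user'):
    """Return, for each dict in new_rows, the key plus the entries that differ."""
    old_by_key = {row[key]: row for row in old_rows}
    output = []
    for row in new_rows:
        old = old_by_key.get(row[key])
        if old is None:
            output.append(dict(row))  # no counterpart: keep everything
            continue
        res = {key: row[key]}
        # Include entries that are new or that the predicate reports as different.
        res.update((k, v) for k, v in row.items()
                   if k != key and (k not in old or differs(v, old[k])))
        output.append(res)
    return output

# Shortened sample data for illustration.
old_rows = [{'user': "nameA", 'a': 7.6, 'b': 100.0}]
new_rows = [{'user': "nameA", 'a': 7.6, 'b': 99.9}]

# Default check: any change is reported.
print(diff_by_user(old_rows, new_rows))  # [{'user': 'nameA', 'b': 99.9}]

# Custom check: values that round to the same whole number count as equal.
print(diff_by_user(old_rows, new_rows,
                   differs=lambda x, y: round(x) != round(y)))  # [{'user': 'nameA'}]
```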

Answered by Artsiom Rudzenka, Oct 13 '22 04:10