Recursive diff of two dictionaries (keys and values)?

I have a Python dictionary, call it d1, and a version of that dictionary at a later point in time, call it d2. I want to find all the changes between d1 and d2: everything that was added, removed, or changed. The tricky bit is that the values can be ints, strings, lists, or dicts, so the comparison needs to be recursive. This is what I have so far:

    def dd(d1, d2, ctx=""):
        print("Changes in " + ctx)
        for k in d1:
            if k not in d2:
                print(k + " removed from d2")
        for k in d2:
            if k not in d1:
                print(k + " added in d2")
                continue
            if d2[k] != d1[k]:
                if type(d2[k]) not in (dict, list):
                    print(k + " changed in d2 to " + str(d2[k]))
                else:
                    if type(d1[k]) != type(d2[k]):
                        print(k + " changed to " + str(d2[k]))
                        continue
                    else:
                        if type(d2[k]) == dict:
                            dd(d1[k], d2[k], k)
                            continue
        print("Done with changes in " + ctx)
        return

It works just fine unless the value is a list. I can't quite come up with an elegant way to deal with lists without repeating a huge, slightly changed version of this function after an if type(d2) == list check.

Any thoughts?

EDIT: This differs from this post because the keys can change

Alex asked May 05 '11 20:05


2 Answers

If you want a recursive diff, I have written a Python package for that: https://github.com/seperman/deepdiff

Installation

Install from PyPI:

    pip install deepdiff

Example usage

Importing

    >>> from deepdiff import DeepDiff
    >>> from pprint import pprint
    >>> from __future__ import print_function # In case running on Python 2

Same object returns empty

    >>> t1 = {1:1, 2:2, 3:3}
    >>> t2 = t1
    >>> print(DeepDiff(t1, t2))
    {}

Type of an item has changed

    >>> t1 = {1:1, 2:2, 3:3}
    >>> t2 = {1:1, 2:"2", 3:3}
    >>> pprint(DeepDiff(t1, t2), indent=2)
    { 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                     'newvalue': '2',
                                     'oldtype': <class 'int'>,
                                     'oldvalue': 2}}}

Value of an item has changed

    >>> t1 = {1:1, 2:2, 3:3}
    >>> t2 = {1:1, 2:4, 3:3}
    >>> pprint(DeepDiff(t1, t2), indent=2)
    {'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}

Item added and/or removed

    >>> t1 = {1:1, 2:2, 3:3, 4:4}
    >>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff)
    {'dic_item_added': ['root[5]', 'root[6]'],
     'dic_item_removed': ['root[4]'],
     'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}
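
For a single level of nesting, the three groups in the result above correspond to plain set arithmetic on the dictionaries' keys. The following is a minimal sketch of that idea, not DeepDiff's implementation: `shallow_diff` and its result keys are hypothetical names chosen for illustration, and the sketch assumes the keys can be sorted.

```python
def shallow_diff(old, new):
    """Compare two dicts one level deep (hypothetical helper, not part of DeepDiff)."""
    old_keys, new_keys = set(old), set(new)
    return {
        # Keys present only in the newer dict.
        "added": sorted(new_keys - old_keys),
        # Keys present only in the older dict.
        "removed": sorted(old_keys - new_keys),
        # Keys present in both whose values differ, mapped to (old, new).
        "changed": {
            k: (old[k], new[k])
            for k in sorted(old_keys & new_keys)
            if old[k] != new[k]
        },
    }


t1 = {1: 1, 2: 2, 3: 3, 4: 4}
t2 = {1: 1, 2: 4, 3: 3, 5: 5, 6: 6}
result = shallow_diff(t1, t2)
print(result)
```

Recursing into values that are themselves dicts or lists is what turns this one-level comparison into a deep diff.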

String difference

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
    >>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    { 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                          "root[4]['b']": { 'newvalue': 'world!',
                                            'oldvalue': 'world'}}}

String difference 2

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    { 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                    '+++ \n'
                                                    '@@ -1,5 +1,4 @@\n'
                                                    '-world!\n'
                                                    '-Goodbye!\n'
                                                    '+world\n'
                                                    ' 1\n'
                                                    ' 2\n'
                                                    ' End',
                                            'newvalue': 'world\n1\n2\nEnd',
                                            'oldvalue': 'world!\n'
                                                        'Goodbye!\n'
                                                        '1\n'
                                                        '2\n'
                                                        'End'}}}
    >>>
    >>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
    --- 
    +++ 
    @@ -1,5 +1,4 @@
    -world!
    -Goodbye!
    +world
     1
     2
     End
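
The `diff` entry above is in unified diff format. Text of that shape can be produced for two multi-line strings with the standard library's `difflib.unified_diff`, as in the sketch below. Whether DeepDiff itself calls `difflib` is an assumption not confirmed here; the variable names are illustrative only.

```python
import difflib

old = "world!\nGoodbye!\n1\n2\nEnd"
new = "world\n1\n2\nEnd"

# unified_diff works on sequences of lines; lineterm="" stops it from
# appending a newline to the header and hunk lines it generates.
diff_lines = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
diff_text = "\n".join(diff_lines)
print(diff_text)
```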

Type change

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    { 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                          'newvalue': 'world\n\n\nEnd',
                                          'oldtype': <class 'list'>,
                                          'oldvalue': [1, 2, 3]}}}

List difference

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    {'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}

List difference 2:

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    { 'iterable_item_added': {"root[4]['b'][3]": 3},
      'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                          "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}

List difference ignoring order or duplicates: (with the same dictionaries as above)

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
    >>> ddiff = DeepDiff(t1, t2, ignore_order=True)
    >>> print (ddiff)
    {}
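
To see why the two lists above count as equal once both order and duplicates are ignored, it may help to compare them in plain Python. This is only an illustration of the two notions of equality, assuming hashable list items; it says nothing about how `ignore_order` is implemented inside DeepDiff.

```python
from collections import Counter

a = [1, 2, 3]
b = [1, 3, 2, 3]

# Sets discard both ordering and repeated items.
same_ignoring_order_and_duplicates = set(a) == set(b)

# Counters discard ordering but still count how often each item occurs.
same_ignoring_order_only = Counter(a) == Counter(b)

print(same_ignoring_order_and_duplicates)  # True
print(same_ignoring_order_only)            # False: b holds the value 3 twice
```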

List that contains dictionary:

    >>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
    >>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (ddiff, indent = 2)
    { 'dic_item_removed': ["root[4]['b'][2][2]"],
      'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}

Sets:

    >>> t1 = {1, 2, 8}
    >>> t2 = {1, 2, 3, 5}
    >>> ddiff = DeepDiff(t1, t2)
    >>> pprint (DeepDiff(t1, t2))
    {'set_item_added': ['root[3]', 'root[5]'], 'set_item_removed': ['root[8]']}

Named Tuples:

    >>> from collections import namedtuple
    >>> Point = namedtuple('Point', ['x', 'y'])
    >>> t1 = Point(x=11, y=22)
    >>> t2 = Point(x=11, y=23)
    >>> pprint (DeepDiff(t1, t2))
    {'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}

Custom objects:

    >>> class ClassA(object):
    ...     a = 1
    ...     def __init__(self, b):
    ...         self.b = b
    ...
    >>> t1 = ClassA(1)
    >>> t2 = ClassA(2)
    >>>
    >>> pprint(DeepDiff(t1, t2))
    {'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}

Object attribute added:

    >>> t2.c = "new attribute"
    >>> pprint(DeepDiff(t1, t2))
    {'attribute_added': ['root.c'],
     'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}
Seperman answered Sep 28 '22 12:09


Here's an implementation inspired by Winston Ewert:

    def recursive_compare(d1, d2, level='root'):
        if isinstance(d1, dict) and isinstance(d2, dict):
            if d1.keys() != d2.keys():
                s1 = set(d1.keys())
                s2 = set(d2.keys())
                print('{:<20} + {} - {}'.format(level, s1-s2, s2-s1))
                common_keys = s1 & s2
            else:
                common_keys = set(d1.keys())

            for k in common_keys:
                recursive_compare(d1[k], d2[k], level='{}.{}'.format(level, k))

        elif isinstance(d1, list) and isinstance(d2, list):
            if len(d1) != len(d2):
                print('{:<20} len1={}; len2={}'.format(level, len(d1), len(d2)))
            common_len = min(len(d1), len(d2))

            for i in range(common_len):
                recursive_compare(d1[i], d2[i], level='{}[{}]'.format(level, i))

        else:
            if d1 != d2:
                print('{:<20} {} != {}'.format(level, d1, d2))

    if __name__ == '__main__':
        d1={'a':[0,2,3,8], 'b':0, 'd':{'da':7, 'db':[99,88]}}
        d2={'a':[0,2,4], 'c':0, 'd':{'da':3, 'db':7}}

        recursive_compare(d1, d2)

This prints:

    root                 + {'b'} - {'c'}
    root.a               len1=4; len2=3
    root.a[2]            3 != 4
    root.d.db            [99, 88] != 7
    root.d.da            7 != 3
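
If the changes are needed as data rather than as printed text, the same recursion can collect them into a list. The sketch below is an illustrative variant, not the answer author's code: `collect_changes` and the `(path, kind, old, new)` tuple layout are hypothetical. It treats a list as a mapping from index to item, so that dicts and lists share one code path, and it compares list items by position only.

```python
def collect_changes(d1, d2, path="root"):
    """Return (path, kind, old, new) tuples describing how d2 differs from d1."""
    if isinstance(d1, list) and isinstance(d2, list):
        # View each list as {index: item} so the dict logic below applies.
        d1, d2 = dict(enumerate(d1)), dict(enumerate(d2))
        fmt = "{}[{}]"
    elif isinstance(d1, dict) and isinstance(d2, dict):
        fmt = "{}.{}"
    else:
        # Leaf values, or containers of different types: compare directly.
        return [] if d1 == d2 else [(path, "changed", d1, d2)]

    changes = []
    for k in d1:
        child = fmt.format(path, k)
        if k not in d2:
            changes.append((child, "removed", d1[k], None))
        else:
            changes.extend(collect_changes(d1[k], d2[k], child))
    for k in d2:
        if k not in d1:
            changes.append((fmt.format(path, k), "added", None, d2[k]))
    return changes


d1 = {'a': [0, 2, 3, 8], 'b': 0, 'd': {'da': 7, 'db': [99, 88]}}
d2 = {'a': [0, 2, 4], 'c': 0, 'd': {'da': 3, 'db': 7}}
changes = collect_changes(d1, d2)
for change in changes:
    print(change)
```

Because items are matched by index, inserting an element at the front of a list is reported as a change at every later position. `None` stands in for the missing side of an added or removed entry, so it cannot be distinguished from a real `None` value without also checking the kind field.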
Gabe answered Sep 28 '22 12:09