Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find matching key-value pairs of two dictionaries

Tags:

python

What will be the most efficient way to check if key-value pair of one dictionary is present in other dictionary as well.
Suppose if I have two dictionaries as dict1 and dict2 and these two dictionaries have some of the key-value pairs in common. I want to find those and print them. What would be the most efficient way to do this? Please suggest.

like image 808
gliese581g Avatar asked Feb 07 '14 15:02

gliese581g


Video Answer


1 Answers

one way would be:

d_inter = dict([k, v for k, v in dict1.iteritems() if k in dict2 and dict2[k] == v])

the other:

d_inter = dict(set(d1.iteritems()).intersection(d2.iteritems()))

I'm not sure which one would be more efficient, so let's compare both of them:

1. Solution with iteration through dicts:

  • we parse all keys of dict1: for k,v in dict1.iteritems() -> O(n)
  • then we check whether the key is in dict2, if k in dict2 and dict2[k] == v -> O(m)

which makes it a global worst case complexity of O(n+m) -> O(n)

2. Solution with sets:

if we assume that converting a dict into a set is O(n):

  • we parse all items of d1 to create the first set set(d1.iteritems()) -> O(n)
  • we parse all items of d2 to create the second set set(d2.iteritems()) -> O(m)
  • we get the intersetion of both which is O(min(len(s), len(t)) on average or O(n * m) in worst case

which makes it a global worst case complexity of O(2n*n*m) which can be considered as O(n^3) for same sized dicts: then solution 1. is best

if we assume that converting a dict into a set is O(1) (constant time)

the average is O(min(n,m)) and the worst case is O(n*m), then solution #1 is best on worst case scenario, but solution #2 is best on average case scenario because O(n+m) > O(min(n,m)).

In conclusion, the solution you choose will depend on your dataset and the measurements you'll make! ;-)

N.B.: I took there the complexity of the set().

N.B.2: for the solution #1 make always the smallest dict as dict2 and for solution #2 the smallest dict as dict1.


N.B.2016: This solution was written for python2. Here's the changes needed to make it python3 ready:

  • replace iteritems() with items() ;
  • you could also use the newer dict comprehension syntax: {[k, v for … == v]} ;
  • as d.items() returns dict_items which is not hashable anymore, you'd have to use a frozenset() instead {frozenset(d1.items()).intersection(d2.items())}.
like image 197
zmo Avatar answered Oct 01 '22 02:10

zmo