I have a dictionary d1 and a list l1.
The dictionary keys are strings, and the values are objects of a class I have defined myself. I can describe the class in more detail if it helps, but for now the relevant part is that each object has a list attribute names, and some of the elements of names may or may not appear in l1.
What I want to do is throw away every entry of the dictionary d1 whose object's names attribute does not contain any of the elements of l1.
As a trivial example:
l1 = ['cat', 'dog', 'mouse', 'horse', 'elephant',
'zebra', 'lion', 'snake', 'fly']
d1 = {'1':['dog', 'mouse', 'horse','orange', 'lemon'],
'2':['apple', 'pear','cat', 'mouse', 'horse'],
'3':['kiwi', 'lime','cat', 'dog', 'mouse'],
'4':['carrot','potato','cat', 'dog', 'horse'],
'5':['chair', 'table', 'knife']}
So the resulting dictionary will be more or less the same: it keeps the key-value pairs 1 to 4, with the fruit and vegetables stripped from each list, and it has no 5th key-value pair, because none of the furniture values appear in l1.
To do this I used a nested list/dictionary comprehension which looked like this:
d2 = {k: [a for a in l1 if a in d1[k]] for k in d1.keys()}
print(d2)
# {'1': ['dog', 'mouse', 'horse'],
#  '3': ['cat', 'dog', 'mouse'],
#  '2': ['cat', 'mouse', 'horse'],
#  '5': [],
#  '4': ['cat', 'dog', 'horse']}
d2 = {k: v for k, v in d2.iteritems() if len(v) > 0}
print(d2)
# {'1': ['dog', 'mouse', 'horse'],
#  '3': ['cat', 'dog', 'mouse'],
#  '2': ['cat', 'mouse', 'horse'],
#  '4': ['cat', 'dog', 'horse']}
This seems to work, but for big dictionaries, 7000+ items, it takes around 20 seconds to work through. In and of itself, not horrible, but I need to do this inside a loop that will iterate 10,000 times, so currently it's not feasible. Any suggestions on how to do this quickly?
You are effectively computing the set intersection of each list occurring in the dictionary values with the list l1. Using lists for set intersections is rather inefficient because of the linear searches involved: every "a in d1[k]" test scans the list from the start. You should turn l1 into a set and use set.intersection() or set membership tests instead (which one depends on whether it is acceptable for the result to be a set).
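To make the difference between the two options concrete, here is a small sketch for a single list, using sample data from the question. The names allowed, values, as_set and as_list are only illustrative and do not appear in the original code.

```python
# Sample data taken from the question.
l1 = ['cat', 'dog', 'mouse', 'horse', 'elephant',
      'zebra', 'lion', 'snake', 'fly']
values = ['dog', 'mouse', 'horse', 'orange', 'lemon']

# Build the set once; membership tests on a set are O(1) on average.
allowed = set(l1)

# Option 1: set.intersection() returns a set (no order, no duplicates).
as_set = allowed.intersection(values)

# Option 2: membership tests keep the order of values and return a list.
as_list = [s for s in values if s in allowed]

print(sorted(as_set))  # ['dog', 'horse', 'mouse']
print(as_list)         # ['dog', 'mouse', 'horse']
```

If the order of the elements or repeated elements matter, the list comprehension is probably the safer choice.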
The full code could look like this:
l1 = set(l1)
d2 = {k: [s for s in v if s in l1] for k, v in d1.iteritems()}
d2 = {k: v for k, v in d2.iteritems() if v}
Instead of the two dictionary comprehensions, it might also be preferable to use a single for loop here:
l1 = set(l1)
d2 = {}
for k, v in d1.iteritems():
    v = [s for s in v if s in l1]
    if v:
        d2[k] = v
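The question mentions that the real values are objects with a list attribute rather than plain lists. The following is only a sketch of how the same idea might carry over, assuming the attribute is called names. The class Item and the function filter_by_names are hypothetical stand-ins, not part of the original code. It drops entries without touching the objects, and it uses items(), which works on both Python 2 and Python 3.

```python
class Item(object):
    """Hypothetical stand-in for the question's user-defined objects."""

    def __init__(self, names):
        self.names = names


def filter_by_names(d, wanted):
    """Keep the entries of d whose object shares a name with wanted."""
    wanted = set(wanted)  # convert once, outside the loop
    # isdisjoint() is True when the two collections share no element,
    # so "not isdisjoint" keeps objects with at least one match.
    return {k: obj for k, obj in d.items()
            if not wanted.isdisjoint(obj.names)}


l1 = ['cat', 'dog', 'mouse']
d1 = {'1': Item(['dog', 'orange']),
      '5': Item(['chair', 'table', 'knife'])}

d2 = filter_by_names(d1, l1)
print(sorted(d2))  # ['1']
```

If this runs inside a loop where l1 does not change, building the set once before the loop avoids repeating that conversion on every iteration.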