I would like to compare elements in a dictionary to one another, and remove items according to some comparison criteria. And I'd love it to be efficient. I have a function that can do this, but it repeatedly copies the dictionay. Surely there is a superior way:
mydict = {1:5,2:7,3:9,4:9,7:7,8:0,111:43,110:77}
def partial_duplicate_destroyer(mydict,tolerance):
for key1 in mydict.keys():
mydict_copy = mydict.copy()
for key2 in mydict_copy.keys():
if key2 - tolerance < key1 < key2 + tolerance and not(key1 == key2):
del(mydict[key1])
break
return mydict
print partial_duplicate_destroyer(mydict,2)
print partial_duplicate_destroyer(mydict,20)
print partial_duplicate_destroyer(mydict,200)
#correct output:
# {4: 9, 8: 0, 111: 43}
# {8: 0, 111: 43}
# {111: 43}
This approach could be reduced down to:
from itertools import combinations
def partial_duplicate_destroyer(mydict, tolerance):
#Modifies in-place. Returns only as a convenience. Copy if you don't want this behaviour.
for key1, key2 in combinations(mydict, 2):
if key1 in mydict and key2 - tolerance < key1 < key2 + tolerance:
del mydict[key1]
return mydict
If we try this:
>>> mydict = {1:5,2:7,3:9,4:9,7:7,8:0,111:43,110:77}
>>> partial_duplicate_destroyer(mydict, 2)
{4: 9, 8: 0, 111: 43}
>>> partial_duplicate_destroyer(mydict, 20)
{8: 0, 111: 43}
>>> partial_duplicate_destroyer(mydict, 200)
{111: 43}
>>> mydict
{111: 43}
This uses itertools.combinations()
to produce all possible combinations of keys (without repeating). This is much more efficient as you don't bother working where you have the keys the same, and it's done more efficiently in C rather than your side with Python.
Note that here you are modifying mydict
in place - that is at the end of this, mydict
is {111: 43}
- you need to copy the dict and work on it in the function, rather than directly on it, if you don't want this behaviour. This is what is shown in the last line.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With