I was wondering if you guys might be able to give me some advice on improving the performance of my code.
I have a set of for loops that check whether a key is in a dictionary whose values are lists. If the key exists, the loop appends to its list; if it doesn't, it adds a new list for that key.
dict = {}
for value in value_list:
    if value.key in dict.keys():
        temp_list = dict[value.key]
        temp_list.append(value.val)
        dict[value.key] = temp_list
    else:
        dict[value.key] = [value.val]
Now this code works fine, but eventually, as the dictionary starts to fill up, the line value.key in dict.keys() becomes slower and slower.
Is there a better way of doing this?
Thanks,
Mike
Don't do this:
value.key in dict.keys()
In Python 2, at least, that creates a list containing every key, which gets more and more expensive as the dictionary gets larger. It then performs an O(n) search on that list to find the key, which defeats the purpose of using a dict.
Instead, just do:
value.key in dict
which doesn't create a temporary list, and does a hash table lookup for the key rather than a linear search.
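Applied to the loop from the question, the change might look like the sketch below. The Value namedtuple and the sample data are hypothetical stand-ins for whatever objects value_list really holds, and the dictionary is named groups here only to avoid shadowing the built-in dict.

```python
from collections import namedtuple

# Hypothetical stand-in for the items in value_list; only .key and .val matter.
Value = namedtuple("Value", ["key", "val"])


def group_values(value_list):
    """Group each item's val under its key, using a direct membership test."""
    groups = {}
    for value in value_list:
        if value.key in groups:  # hash table lookup, no temporary list of keys
            groups[value.key].append(value.val)
        else:
            groups[value.key] = [value.val]
    return groups


# Made-up sample data for illustration.
value_list = [Value("a", 1), Value("b", 2), Value("a", 3)]
grouped = group_values(value_list)
```

Note that the temp_list round trip from the question is not needed either: the list stored in the dictionary is appended to in place.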
setdefault, as mentioned elsewhere, is the cleaner way to do this, but it's very important to understand the above.
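For reference, a minimal sketch of the setdefault version follows. As before, the Value namedtuple and the sample data are assumptions made only so the snippet runs on its own.

```python
from collections import namedtuple

# Hypothetical stand-in for the items in value_list; only .key and .val matter.
Value = namedtuple("Value", ["key", "val"])


def group_with_setdefault(value_list):
    """Group each item's val under its key using dict.setdefault."""
    groups = {}
    for value in value_list:
        # setdefault returns the existing list for the key, or stores and
        # returns the new empty list if the key is not present yet.
        groups.setdefault(value.key, []).append(value.val)
    return groups


# Made-up sample data for illustration.
value_list = [Value("x", 10), Value("y", 20), Value("x", 30)]
grouped = group_with_setdefault(value_list)
```

One trade-off: the empty list passed to setdefault is created on every iteration, even when the key already exists.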
Using collections.defaultdict, this can be simplified to
import collections

# Missing keys are created automatically with an empty list as their value.
d = collections.defaultdict(list)
for value in value_list:
    d[value.key].append(value.val)