python key in dict.keys() performance for large dictionaries

Question

I was wondering if you could give me some advice on improving the performance of my code.

I have a set of for loops that check whether a key is in a dictionary whose values are lists. If the key exists, the value is appended to that key's list; if it doesn't, a new list is added for that key.

dict={}
for value in value_list:
   if value.key in dict.keys():
      temp_list = dict[value.key]
      temp_list.append(value.val)
      dict[value.key] = temp_list
   else:
      dict[value.key] = [value.val]

Now this code works fine, but eventually, as the dictionary starts to fill up, the line value.key in dict.keys() becomes more and more expensive.

Is there a better way of doing this?

Thanks,

Mike

Glenn Maynard · Accepted Answer

Don't do this:

value.key in dict.keys()

That--in Python 2, at least--creates a list containing every key, which gets more and more expensive as the dictionary gets larger. It then performs an O(n) search on that list to find the key, which defeats the purpose of using a dict.

Instead, just do:

value.key in dict

which doesn't create a temporary list, and does a hash table lookup for the key rather than a linear search.
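To make the difference concrete, here is a rough sketch of the two membership tests side by side. It assumes Python 3, where keys() returns a view rather than a list, so list(mapping) is used as a stand-in for what Python 2's keys() did on every call. The sample dictionary and the helper names in_keys_list and in_dict are made up for illustration.

```python
# Hypothetical sample data: 1000 integer keys, each mapped to an empty list.
d = {i: [] for i in range(1000)}

def in_keys_list(key, mapping):
    # Stand-in for Python 2's `key in mapping.keys()`:
    # builds a list of every key, then scans it linearly (O(n) per test).
    return key in list(mapping)

def in_dict(key, mapping):
    # Membership test directly on the dict: a hash table lookup,
    # O(1) on average, with no temporary list.
    return key in mapping

found_slow = in_keys_list(999, d)   # same answer, much more work
found_fast = in_dict(999, d)
missing = in_dict(-1, d)
```

Both tests return the same answer; only the amount of work per test differs, and that gap widens as the dictionary grows.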

setdefault, as mentioned elsewhere, is the cleaner way to do this, but it's very important to understand the above.
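For reference, a minimal sketch of the setdefault version of the loop from the question. The Item namedtuple and the sample value_list are assumptions standing in for the question's objects, which are only known to expose .key and .val; the dictionary is named groups here to avoid shadowing the built-in dict.

```python
from collections import namedtuple

# Hypothetical stand-in for the question's objects (they expose .key and .val).
Item = namedtuple("Item", ["key", "val"])
value_list = [Item("a", 1), Item("b", 2), Item("a", 3)]

groups = {}
for value in value_list:
    # setdefault returns the list already stored under value.key, or stores
    # the new empty list under that key and returns it if the key is missing.
    groups.setdefault(value.key, []).append(value.val)
```

This replaces the whole if/else with a single lookup per item. Note that the empty list passed as the default is constructed on every iteration even when the key already exists.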

Sven Marnach · Answer

Using collections.defaultdict, this can be simplified to

import collections

# Missing keys automatically get a new empty list on first access.
d = collections.defaultdict(list)
for value in value_list:
    d[value.key].append(value.val)
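A self-contained version of the same idea that can be run as-is is sketched below. The Item namedtuple and the sample data are assumptions, since the question does not show how value_list is built.

```python
import collections

# Hypothetical stand-in for the question's objects (they expose .key and .val).
Item = collections.namedtuple("Item", ["key", "val"])
value_list = [Item("x", 10), Item("y", 20), Item("x", 30)]

d = collections.defaultdict(list)
for value in value_list:
    # Indexing a missing key calls list() to create the entry, then appends.
    d[value.key].append(value.val)

# `in` and .get() do not trigger the default factory, so they are safe
# ways to test for a key without creating an empty entry.
z_present = "z" in d
z_value = d.get("z")

# Convert back to a plain dict if later code should raise KeyError on misses.
plain = dict(d)
```

One thing to keep in mind with this approach: reading d[some_key] for a key that is absent will insert an empty list for it, so use in or get() when you only want to check.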
