Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the time complexity of popping an element from a dict in Python?

This is kind of a combinatorics problem. I have a very large list, all_possible—which stores all of the possible values—and a very large dict, present, which stores all of the actually present values, e.g.:

all_possible = [1, 2, 3, 4, 5]
present = {1: 'some_value', 3: 'some_other_value'}

Currently, my search looks a bit like:

for key in all_possible:
    value = present.get(key, None)
    if value is None:
        do_something_if_key_not_present()
    else:
        do_something_if_key_is_present()

This works nicely, since for each iteration there is only one lookup in the dictionary, and the average lookup time for a Python dictionary is O(1).

However, amortized worst case lookup time is O(N), and since the dictionary can be so huge (millions of elements), one of the optimizations I've considered involves popping the elements from the dict as I traverse (so subsequent lookups have a smaller search space):

for key in all_possible:
    value = present.pop(key, None)  # this line changes, dict shrinks
    if value is None:
        do_something_if_key_not_present()
    else:
        do_something_if_key_is_present()

My question is what is the time complexity of the dictionary pop? I know that average case pop operations in structures like list are O(N), but I cannot find any reliable documentation that denotes the complexity of popping from a dict. If it ends up being O(1), this may speed up my search, but if it's any higher, I might be hurting myself.

like image 375
TayTay Avatar asked May 16 '17 19:05

TayTay


Video Answer


1 Answers

The time complexity of dict.pop is exactly the same as that of dict.get. list.pop is O(N) because it needs to shift elements, but dict.pop doesn't do that.

That said, dict.pop probably won't improve your lookup time. Deleting a key from a dict needs to leave a DKIX_DUMMY marker in its place, and the lookup routine needs to treat any DKIX_DUMMY it finds as if it were a hash collision, and keep going. At best, you'll save some == comparisons on deleted keys.

Even if dict.pop were an improvement, it wouldn't save you from O(N) worst-case lookup time. If you need to deal with adversarial key choices, dicts may not be a good fit for your use case.

like image 133
user2357112 supports Monica Avatar answered Sep 22 '22 04:09

user2357112 supports Monica