How exactly is dict implemented such that lookups degrade to linear time under collisions? I would assume that it is implemented as a hash table backed by a list. I would presume that a better implementation would be O(log n) for various operations, using a tree to back the table instead. Is there some magic happening behind the scenes to keep the constant-time lookups alive for as long as possible?
My source for this, by the way, is this:
http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=python+complexity
O(1) means constant without regard to the size of the data. The hash function takes a certain amount of time, but that amount of time doesn't scale with the size of the collection.
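To make the "constant regardless of size" point concrete, here is a minimal, hypothetical sketch of a hash table using separate chaining. This is not how CPython's dict actually works internally (CPython uses open addressing and resizes the table to keep it sparse); it only illustrates why the cost of finding the right slot does not grow with the number of entries.

    # Simplified sketch of a hash table, NOT CPython's real dict.
    # The point: locating the bucket is one hash() plus one modulo,
    # independent of how many items are stored.
    class SketchDict:
        def __init__(self, capacity=8):
            self._buckets = [[] for _ in range(capacity)]  # separate chaining

        def __setitem__(self, key, value):
            bucket = self._buckets[hash(key) % len(self._buckets)]
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)  # overwrite an existing key
                    return
            bucket.append((key, value))

        def __getitem__(self, key):
            bucket = self._buckets[hash(key) % len(self._buckets)]
            for k, v in bucket:   # normally a very short scan
                if k == key:
                    return v
            raise KeyError(key)

    d = SketchDict()
    d["answer"] = 42
    print(d["answer"])  # 42

As long as the buckets stay short (which a real implementation guarantees by resizing), each lookup does a small, bounded amount of work.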
If we explain the difference in Big O terms, dictionaries have constant time complexity, O(1), while lists have linear time complexity, O(n).
The average time complexity is of course O(1). The best way to investigate is to look at the hashes of the objects you are using as keys. The CPython dict uses int PyObject_Hash(PyObject *o), which is the equivalent of hash(o).
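For example, you can inspect the hashes of your keys directly from Python; hash() is what the dict machinery ultimately relies on (PyObject_Hash at the C level):

    # Print the hash Python computes for a few hashable keys.
    keys = ["spam", 42, (1, 2), frozenset({3})]
    for k in keys:
        print(k, hash(k))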
The reason is that a dictionary is a lookup, while a list is an iteration. A dictionary uses a hash lookup, whereas a list requires walking through its elements from the beginning until it finds the result, every single time.
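A rough way to see the difference yourself (exact numbers depend on your machine, but the gap grows with n):

    import timeit

    n = 100_000
    data_list = list(range(n))
    data_dict = dict.fromkeys(range(n))

    # Membership in a list walks elements one by one (O(n));
    # membership in a dict hashes the key and jumps to its slot (O(1) average).
    print(timeit.timeit(lambda: n - 1 in data_list, number=100))
    print(timeit.timeit(lambda: n - 1 in data_dict, number=100))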
Dict is O(1) for most operations, except for operations that touch all elements, such as iteration and copy (in which case, it's obviously O(n)).
See: http://wiki.python.org/moin/TimeComplexity
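For instance, a single lookup does a bounded amount of work regardless of the dict's size, while copying or iterating necessarily visits every entry:

    d = {i: i * i for i in range(1000)}

    # O(1) on average: one lookup, independent of len(d)
    value = d[500]

    # O(n): these must touch every item
    copied = dict(d)
    total = sum(d.values())
    for key in d:
        pass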
It has O(n) worst case, because you can always contrive a pathological example where all the keys have the same hash value.
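For example, a key class with a constant hash forces every key into the same slot, so each lookup degrades to a linear scan over the colliding keys:

    # Deliberately pathological keys: every instance hashes to the same value.
    class BadKey:
        def __init__(self, name):
            self.name = name
        def __hash__(self):
            return 42  # constant hash => maximal collisions
        def __eq__(self, other):
            return isinstance(other, BadKey) and self.name == other.name

    d = {BadKey(i): i for i in range(1000)}
    print(d[BadKey(999)])  # still correct, but each lookup is O(n)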