Let's say I have two dictionaries and I know want to measure the time needed to check if a key is in the dictionary. I tried to run this piece of code:
from timeit import timeit
dct1 = {str(i): 1 for i in range(10**7)}
dct2 = {i: 1 for i in range(10**7)}
print(timeit('"7" in dct1', setup='from __main__ import dct1', number=10**8))
print(timeit('7 in dct2', setup='from __main__ import dct2', number=10**8))
Here are the results that I get:
2.529034548999334
2.212983401999736
Now, let's say I try to mix integers and strings in both dictionaries, and measure access time again:
dct1[7] = 1
dct2["7"] = 1
print(timeit('"7" in dct1', setup='from __main__ import dct1', number=10**8))
print(timeit('7 in dct1', setup='from __main__ import dct1', number=10**8))
print(timeit('7 in dct2', setup='from __main__ import dct2', number=10**8))
print(timeit('"7" in dct2', setup='from __main__ import dct2', number=10**8))
I get something weird:
3.443614432000686
2.6335261530002754
2.1873921409987815
2.272667104998618
The first value is much higher than what I had before (3.44 vs 2.52). However, the third value is basically the same as before (2.18 vs 2.21). Why is this happening? Can you reproduce the same thing or is this only me? Also, I can't understand the big difference between the first and the second value: it looks like it's more difficult to access a string key, but the same thing seems to apply only slightly to the second dictionary. Why?
Update
You don't even need to actually add a new key. All you need to do to see an increase in complexity is just checking if a key with different type exists!! This is much weirder than I thought. Look at the example here:
from timeit import timeit
dct1 = {str(i): 1 for i in range(10**7)}
dct2 = {i: 1 for i in range(10**7)}
print(timeit('"7" in dct1', setup='from __main__ import dct1', number=10**8))
# 2.55
print(timeit('7 in dct2', setup='from __main__ import dct2', number=10**8))
# 2.26
7 in dct1
"7" in dct2
print(timeit('"7" in dct1', setup='from __main__ import dct1', number=10**8))
# 3.34
print(timeit('7 in dct2', setup='from __main__ import dct2', number=10**8))
# 2.35
Keys are unique within a dictionary while values may not be. The values of a dictionary can be of any type, but the keys must be of an immutable data type such as strings, numbers, or tuples.
Second, a dictionary key must be of a type that is immutable. For example, you can use an integer, float, string, or Boolean as a dictionary key. However, neither a list nor another dictionary can serve as a dictionary key, because lists and dictionaries are mutable.
Python dictionary is a container of key-value pairs. It is mutable and can contain mixed types. A dictionary is an unordered collection. Python dictionaries are called associative arrays or hash tables in other languages.
No, each key in a dictionary should be unique. You can't have two keys with the same value. Attempting to use the same key again will just overwrite the previous value stored. If a key needs to store multiple values, then the value associated with the key should be a list or another dictionary.
The dictionary class has a method named items. We can access the items method and iterate over it getting each pair of key and value.
Using a for loop we can access both the key and value at each of the index position in dictionary as soon in the below program. Running the above code gives us the following result − In this approach we consider the key similar to an index in a list. So in the print statement we represent the keys and the values as a pair along with the for loop.
If you specify a key a second time during the initial creation of a dictionary, then the second occurrence will override the first. Second, a dictionary key must be of a type that is immutable.
But there are a few restrictions I want to briefly talk about. Keys in a dictionary can only be used once. If it is used more than once, as you saw earlier, it’ll simply replace the value. 00:19 A key must be immutable—that is, unable to be changed. These are things like integers, floats, strings, Booleans, functions.
Let me try to answer my own question. The dict implementation in CPython is optimised for lookups of str keys. Indeed, there are two different functions that are used to perform lookups:
lookdict
is a generic dictionary lookup function that is used with all types of keyslookdict_unicode
is a specialised lookup function used for dictionaries composed of str-only keysPython will use the string-optimised version until a search for non-string data, after which the more general function is used.
And it looks like you cannot even reverse the behaviour of a particular dict instance: once it starts using the generic function, you can't go back to using the specialised one!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With