Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why must dictionary keys be immutable?

Tags:

Why is it necessary for dictionary keys to be immutable? I'm looking for a simple, clear reason why keys in Python dictionaries have that restriction.

like image 557
sam Avatar asked Jun 14 '14 07:06

sam


People also ask

Is dictionary keys must be immutable?

Second, a dictionary key must be of a type that is immutable. For example, you can use an integer, float, string, or Boolean as a dictionary key. However, neither a list nor another dictionary can serve as a dictionary key, because lists and dictionaries are mutable.

What is an immutable key?

The primary key should be immutable, meaning that its value should not be changed during the course of normal operations of the database. (Recall that a primary key is the means of uniquely identifying a tuple, and that identity, by definition, never changes.)


1 Answers

On my computer, there's a file /etc/dictionaries-common/words containing a large collection of English words:

>>> with open("/etc/dictionaries-common/words") as f: ...     words = [line.strip() for line in f] ...  >>> "python" in words True >>> "BDFL" in words False 

Let's create a dictionary storing the lengths of all those words:

>>> word_lengths = {w: len(w) for w in words} >>> word_lengths["parrot"] 6 

And, just for kicks, we'll shuffle our original word list:

>>> from random import shuffle >>> shuffle(words) >>> words[:5] ["Willie's", 'Araceli', 'accessed', 'engagingly', 'hobnobs'] 

Mmm, hobnobs. Anyway ... now that we've messed around a bit with words, we've become a little bit paranoid (possibly for the same reason that we're craving hobnobs), and we want to check that all the words in our word_lengths dictionary are still in words after we mixed it all up:

>>> all(w in words for w in word_lengths) True 

Well, we got there, but on my machine that took over three minutes - enough time to eat a couple more delicious biscuits, at least. Thinking about it, it's obvious why: we've got ...

>>> len(words) 99171 

... nearly a hundred thousand words to check, and for each one in the dictionary, Python has to search through our mixed-up list of words until it finds a match. It won't always have to check the whole list, but on average that's going to be fifty thousand words (or half the list) each time, for a total of 50,000 × 100,000 = 5,000,000,000 tests. Five billion is a lot, even in this miraculous age of technology.

Just to be absolutely sure (I'm not normally so paranoid; normally I just get sleepy), let's check the other way around, and make sure that everything in words is still in word_lengths:

>>> all(w in word_lengths for w in words) True 

Hey, what? This time it was, like, a tenth of a second! What gives? You're freaking me out, man ... and hey, where are my biscuits? I had them just now, I'm sure of it.

Unlike a list, which can be in any old order (so making sure that some item is in there means checking each item in turn until we find it), a dictionary is a bit more efficient. Probably less fun at parties, but hey, leave it in charge of the music and all is copacetic, y'know?

The secret of dictionaries' ruthless efficiency is that for each item, the dictionary calculates a hash (just an integer, really) of the key based on its content, and uses that to store the item at a specific location in memory. Then, when you go looking for the item, it calculates the hash of the key's content again, says to itself "okay, "python", that hashes to 7036520087640895475 ... yeah, I know where I must have put that, then", and goes straight to the right memory location to find it. So this time, it only had to do a hundred thousand checks rather than five billion.

It's kinda like having all your CDs neatly alphabetised on shelves, rather than stacked randomly out of their cases on top of your speakers. Dictionaries know where it's at, I'm telling you.

But there's a price to pay for dictionaries' ability to keep it together. Remember when I said that the dictionary calculates a hash based on the item's content? Well, what happens if that content changes? For immutable objects that's not a problem - their content can't change - but mutable objects, by definition, can change their contents, and when they do, their hash (if they even have one) will change too. Which is cool, obviously, not everyone wants to be put in a box, I get that, but if the hash has changed, there's no way for the dictionary to work out where it put the thing.

It's as though Joy Division changed their name to New Order, and now you've got no idea where you put that 12" remix of Blue Monday. It's just not gonna work.

So, dictionaries have a rule: if you want to be a key, don't go changing.

like image 190
Zero Piraeus Avatar answered Sep 25 '22 19:09

Zero Piraeus