
How to force Python dictionary to shrink?

I have run into this in other languages, and now I have the same problem in Python. I have a dictionary that sees a lot of CRUD operations. One would assume that deleting elements from a dictionary decreases its memory footprint, but that is not the case. Once a dictionary grows (usually by doubling), it never(?) releases the allocated memory. I ran this experiment:

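import random
import sys
import uuid

a = {}

# Grow the dictionary, printing its size every 1000 insertions.
for i in range(0, 100000):
    a[uuid.uuid4()] = uuid.uuid4()
    if i % 1000 == 0:
        print(sys.getsizeof(a))

# Delete every entry in random order, printing the size every 1000 deletions.
for i in range(0, 100000):
    # list() keeps this working on Python 3, where keys() returns a view.
    e = random.choice(list(a.keys()))
    del a[e]
    if i % 1000 == 0:
        print(sys.getsizeof(a))

print(len(a))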

The last value printed by the first loop is 6291736. The last value printed by the second loop is 6291736 as well, even though the dictionary ends up with 0 elements.

So how can I tackle this? Is there a way to force the memory to be released?

PS: The random choice isn't really needed; I also played with the range of the second loop.

Schultz9999 asked Jul 15 '15 23:07


2 Answers

The way to force this "rehashing", so that the dictionary uses less memory, is to create a new dictionary and copy the contents over.

The Python dictionary implementation is explained really well in this video:

https://youtu.be/C4Kc8xzcA68

An attendee asks this same question (https://youtu.be/C4Kc8xzcA68?t=1593), and the speaker's answer is:

Resizes are only calculated upon insertion; as a dictionary shrinks it just gains a lot of dummy entries and as you refill it will just start reusing those to store keys. [...] you have to copy the keys and values out to a new dictionary
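
A minimal sketch of that copy-out approach, assuming CPython (the numbers reported by sys.getsizeof are implementation details, and the key count and variable names are only illustrative):

```python
import sys

# Build a large dictionary, then delete almost everything from it.
big = {i: i for i in range(100000)}
for i in range(10, 100000):
    del big[i]

# The table behind `big` is still sized for its peak population.
size_before = sys.getsizeof(big)

# Copying into a fresh dictionary allocates a table sized for the
# surviving entries only.
compact = dict(big)
size_after = sys.getsizeof(compact)

print(len(big), size_before)
print(len(compact), size_after)
```

Rebinding the original name (big = dict(big)) lets the old, oversized table be freed once nothing else refers to it.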

franciscod answered Oct 05 '22 04:10


Actually, a dictionary can shrink when it is resized, but a resize only happens on key insertion, not on removal. Here's a comment from the CPython source for dictresize:

Restructure the table by allocating a new table and reinserting all items again. When entries have been deleted, the new table may actually be smaller than the old one.

By the way, since the other answer quotes Brandon Rhodes' PyCon 2010 talk on the dictionary, and that quote seems to be at odds with the comment above (which has been in the source for years), I thought I would include the fuller quote, including the part that the other answer left out:

Resizes are only calculated upon insertion. As a dictionary shrinks, it just gains a lot of dummy entries and as you refill it, it will just start re-using those to store keys. It will not resize until you manage to make it two-thirds full again at its larger size. So it does not resize as you delete keys. You have to do an insert to get it to figure out it needs to shrink.

So he does say the resizing operation can "figure out [the dictionary] needs to shrink", but that only happens on insert. Apparently, when all the keys are copied over during a resize, the dummy entries are dropped, which reduces the size of the backing array.

It isn't clear, however, how to get this to happen, which is why Rhodes says to just copy over everything to a new dictionary.
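
For illustration only, the insert-triggered shrink can be observed on CPython 3.7 or later, where (as far as I can tell from the implementation) every insertion of a new key uses up a slot in the table even if that key is deleted right away. This is an assumption about one implementation's internals, not documented behavior, so the loop below is bounded and simply reports whether a size change was seen. The helper name churn_until_resize is made up for this sketch.

```python
import sys

def churn_until_resize(d, temp_key, max_rounds=1000000):
    """Insert and delete temp_key until sys.getsizeof(d) changes.

    Returns the number of rounds it took, or None if the size never
    changed within max_rounds. temp_key must not already be in d.
    """
    start = sys.getsizeof(d)
    for rounds in range(1, max_rounds + 1):
        d[temp_key] = None  # a resize can only be triggered by an insert
        del d[temp_key]     # deleting does not resize
        if sys.getsizeof(d) != start:
            return rounds
    return None

# Grow a dictionary, then delete all but ten entries.
a = {i: i for i in range(100000)}
for i in range(10, 100000):
    del a[i]

size_before = sys.getsizeof(a)
rounds = churn_until_resize(a, temp_key=-1)
size_after = sys.getsizeof(a)

print(size_before, size_after, rounds)
```

Since this depends on internals that can change between versions, copying into a new dictionary remains the dependable option.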

C S answered Oct 05 '22 03:10