Pre-allocate memory for a dictionary?

In Python 3.4, I am populating a dictionary in a large loop, storing 30000 * 1000 double-precision numbers in it. I would like to allocate memory for the dictionary beforehand so that I can reduce the performance overhead of allocating memory on each iteration.

Also, how do I check the maximum amount of memory that a dictionary (or list) is allowed to allocate in Python? For example, if it only allows 50MB, I will try to avoid overflowing it. This may depend on the operating system and other factors, but I would like to have an idea of how to maximize performance.

I can use

ll = [None] * 1000

to allocate memory for a list.

Is there a similar way to do this for a dictionary?

d = {None} * 1000 ? 
or 
d = {None: None} * 1000 ? 

thanks

Asked Feb 05 '23 by Lily

1 Answer

Pre-allocating the list ensures that all the index values exist up front, so they can be assigned to immediately; I assume that is what you mean by pre-allocating a dict. In that case you can create all the keys in advance:

d = dict.fromkeys(range(1000))

or use any other sequence of keys you have handy. If you want to preallocate a value other than None you can do that too:

d = dict.fromkeys(range(1000), 0)
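
One caveat worth noting (a general Python gotcha, not something the question asked about): dict.fromkeys stores the same default object under every key, which matters if the default is mutable. A minimal sketch:

# dict.fromkeys reuses one default object for every key.
shared = dict.fromkeys(range(3), [])
shared[0].append('x')
print(shared)        # {0: ['x'], 1: ['x'], 2: ['x']} -- one shared list

# If each key needs its own mutable value, build the dict with a comprehension.
separate = {k: [] for k in range(3)}
separate[0].append('x')
print(separate)      # {0: ['x'], 1: [], 2: []}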

Edit: since you've edited your question to clarify that you meant to pre-allocate the memory, the answer is no, you cannot pre-allocate the memory, nor would it be useful to do so. Most of the memory used is not the dictionary itself; it is the objects used as keys and values. The dictionary itself allocates memory in a way that is amortized constant time per insertion: it starts off small and then resizes in progressively larger chunks, so the overall cost per item is effectively constant.
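
If you want to see that resizing behaviour for yourself, here is a small sketch (the exact sizes and resize thresholds vary with the Python version and build, so treat the numbers as illustrative):

import sys

d = {}
last = sys.getsizeof(d)
for i in range(100000):
    d[i] = None
    size = sys.getsizeof(d)
    if size != last:
        # The container only grows at certain fill thresholds, in
        # progressively larger jumps, so the per-insert cost stays
        # amortized constant.
        print(len(d), size)
        last = size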

Allocating 30 million entries in a dictionary requires a substantial amount of memory for the dict itself (see the measurement below), but the individual key and value objects will require a lot more, so unless you have a lot of RAM in your system it will be the content of the dictionary that gives you a problem rather than the dictionary itself.

If you fire up the interactive prompt you'll find that it only takes a few seconds to run this:

>>> d = dict.fromkeys(range(30000000))
>>> import sys
>>> sys.getsizeof(d)
1610613016

So that is 1,610,613,016 bytes (about 1.5GB) for a dictionary whose keys are integers and whose values are all None. Store unique values as well and you roughly double the memory use even if they are just integers; if they are strings or complex objects, your memory consumption will be very high.
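
To get a rough idea of how much the key and value objects themselves add on top of the container, you can sum sys.getsizeof over them. This is only a crude estimate (it counts shared objects such as None once per reference and ignores small-int caching), and the helper below is my own sketch, not a standard-library function:

import sys

def rough_deep_size(d):
    # Container plus every key and value object, counted naively.
    return (sys.getsizeof(d)
            + sum(sys.getsizeof(k) for k in d)
            + sum(sys.getsizeof(v) for v in d.values()))

keys_only = dict.fromkeys(range(1000000))        # values are all None
with_ints = {i: i for i in range(1000000)}       # one unique int per value
print(rough_deep_size(keys_only))
print(rough_deep_size(with_ints))                # noticeably larger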

Answered Feb 08 '23 by Duncan