>>> from sys import getsizeof
>>> a=[i for i in range(1000)]
>>> b={i for i in range(1000)}
>>> getsizeof(a)
9024
>>> getsizeof(b)
32992
My question is: why does a set consume so much more memory than a list? Lists are ordered, sets are not. Is it the internal structure of a set that consumes the memory? Or does a list contain pointers and a set does not? Or is sys.getsizeof simply wrong here? I've seen questions comparing tuples, lists and dictionaries, but I could not find any comparison between lists and sets.
To see where the memory goes, consider storing a million small integers. Each one fits easily in a 64-bit machine word, so one might hope Python would store them in no more than ~8 MB: a million 8-byte values. In practice, CPython uses more like 35 MB of RAM, because every number is a full Python object and the list stores only references to those objects.
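A rough way to check that figure yourself on 64-bit CPython (the ~28-byte size of a small int object is an implementation detail and varies slightly between versions):

from sys import getsizeof

nums = list(range(1_000_000))

# The list's own buffer: one 8-byte reference per element plus a small header.
list_bytes = getsizeof(nums)

# Each int above 256 is a separate heap object of roughly 28 bytes.
int_bytes = sum(getsizeof(n) for n in nums)

print(f"list buffer: {list_bytes / 1e6:.1f} MB")
print(f"int objects: {int_bytes / 1e6:.1f} MB")
print(f"total:       {(list_bytes + int_bytes) / 1e6:.1f} MB")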
When you create a list object, the list object by itself takes about 64 bytes of memory (the exact figure varies slightly by CPython version), and each item adds 8 bytes to the size of the list, because the list stores only a reference to each item object.
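You can verify the header and per-reference costs directly; the header is 56 or 64 bytes depending on the Python version, but each element adds one 8-byte reference on a 64-bit build:

from sys import getsizeof

print(getsizeof([]))           # just the list header
print(getsizeof([None]))       # header + one 8-byte reference
print(getsizeof([None] * 10))  # header + ten 8-byte references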
Generally, lists are faster than sets for iteration and indexed access. But when searching for an element in a collection, sets are faster, because they are implemented as hash tables: Python does not have to scan the whole set, so the average time complexity of a membership test is O(1).
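A quick, informal timing sketch of that difference (the exact numbers depend on your machine; the point is the gap, not the absolute values):

import timeit

setup = "data_list = list(range(100_000)); data_set = set(data_list)"

# Worst case for the list: the element is near the end, so 'in' scans almost everything.
print(timeit.timeit("99_999 in data_list", setup=setup, number=1_000))

# The set hashes the value and jumps straight to its bucket: O(1) on average.
print(timeit.timeit("99_999 in data_set", setup=setup, number=1_000))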
Tuples, by comparison, take up slightly less memory than lists and are faster to access; they aren't as flexible, but they are more efficient, and a common use is to serve as dictionary keys. Sets are also collection structures, but with two differences from lists and tuples: they are unordered, and their elements must be unique (and hashable).
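For example (the exact sizes vary slightly by Python version, but the ordering holds):

from sys import getsizeof

print(getsizeof([1, 2, 3]))   # list: header plus a resizable array of references
print(getsizeof((1, 2, 3)))   # tuple: smaller fixed-size header plus 3 references

# Tuples are hashable, so they can serve as dictionary keys; lists cannot.
grid = {(0, 0): "origin", (1, 2): "point"}
print(grid[(1, 2)])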
I think it's because of the inherent difference between a list and a set (or dict), i.e. the way in which the elements are stored.

A list is nothing but a collection of references to the original objects. Suppose you create 1000 integers: 1000 integer objects are created, and the list itself only contains references to these objects.

A set or dictionary, on the other hand, has to compute a hash value for each of those 1000 integers, and it keeps a sparse table of buckets, so the memory consumed grows with both the number of elements and the size of the table.
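One way to see that hashing really happens on insertion into a set but not into a list is a small sketch with an int subclass that announces its __hash__ calls (LoudInt is just an illustrative name):

class LoudInt(int):
    """An int that reports whenever its hash is computed."""
    def __hash__(self):
        print(f"hashing {int(self)}")
        return super().__hash__()

items = [LoudInt(1), LoudInt(2), LoudInt(3)]

lst = list(items)   # prints nothing: a list only stores references
st = set(items)     # prints "hashing 1", "hashing 2", "hashing 3"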
For example, in both set and dict the smallest size is 8 by default (that is, if you are only storing 3 values, Python will still allocate 8 buckets). On resize, the number of buckets increases by 4x until we reach 50,000 elements, after which the size is increased by 2x. This gives the following possible sizes:

8, 32, 128, 512, 2048, 8192, 32768, 131072, 262144, ...
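You can watch these resizes happen by tracking getsizeof as a set grows; the exact thresholds are CPython implementation details and may differ between versions, but the jumps line up with the bucket counts above:

from sys import getsizeof

s = set()
last = getsizeof(s)
for i in range(3000):
    s.add(i)
    size = getsizeof(s)
    if size != last:
        # A jump means the hash table was rebuilt with the next bucket count.
        print(f"{len(s):>5} elements -> {size} bytes")
        last = size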
Some examples:
In [25]: from sys import getsizeof
In [26]: a=[i for i in range(60000)]
In [27]: b={i for i in range(60000)}
In [30]: b1={i for i in range(100000)}
In [31]: a1=[i for i in range(100000)]
In [32]: getsizeof(a)
Out[32]: 514568
In [33]: getsizeof(b)
Out[33]: 2097376
In [34]: getsizeof(a1)
Out[34]: 824464
In [35]: getsizeof(b1)
Out[35]: 4194528
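Those numbers line up with the bucket counts: on a 64-bit build, each bucket in a CPython set stores a hash and a pointer (16 bytes), plus a fixed per-object header. A rough sanity check (the 16-byte entry size and the small header are implementation details and may vary):

from sys import getsizeof

b = {i for i in range(60_000)}     # ends up with 131072 buckets
b1 = {i for i in range(100_000)}   # ends up with 262144 buckets

# Roughly: fixed header + 16 bytes (hash + pointer) per bucket.
print(getsizeof(b), 131072 * 16)   # both around 2 MB
print(getsizeof(b1), 262144 * 16)  # both around 4 MB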
Answers:
Yes, it's the internal structure of a set, i.e. the way it stores its elements in a hash table, that consumes this much memory. And sys.getsizeof is correct; there's nothing wrong with using it here.
For a more detailed reference on list, set and dict internals, please refer to the relevant chapter of High Performance Python.