I am curious to know how memory management differs between bytearray and list in Python.
I have found a few questions like "Difference between bytearray and list", but none of them exactly answers my question.
My question precisely ...
>>> from array import array
>>> x = array("B", (1,2,3,4))
>>> x.__sizeof__()
36
>>> y = bytearray((1,2,3,4))
>>> y.__sizeof__()
32
>>> z = [1,2,3,4]
>>> z.__sizeof__()
36
As we can see, there is a difference in size between list/array.array (36 bytes for 4 elements) and a bytearray (32 bytes for 4 elements). Can someone explain why this is? It makes sense to me that a bytearray occupies 32 bytes of memory for 4 elements (4 * 8 == 32), but how can the sizes of list and array.array be interpreted?
# Let's take the case of bytearray (which makes more sense to me at least :p)
for i in y:
    print(i, ": ", id(i))
1 : 499962320
2 : 499962336 #diff is 16 units
3 : 499962352 #diff is 16 units
4 : 499962368 #diff is 16 units
Why do the addresses of two contiguous elements differ by 16 units here, when each element occupies only 8 bytes? Does that mean each memory address points to a nibble?
Also, what are the criteria for memory allocation for an integer? I read that Python assigns more memory based on the value of the integer (correct me if I am wrong): the larger the number, the more memory.
Eg:
>>> y = 10
>>> y.__sizeof__()
14
>>> y = 1000000
>>> y.__sizeof__()
16
>>> y = 10000000000000
>>> y.__sizeof__()
18
What criteria does Python use to allocate memory?
And why does Python occupy so much more memory, while C only occupies 8 bytes (mine is a 64-bit machine), when these values are well within the range of an integer (2 ** 64)?
Metadata :
Python version : '3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:43:06) [MSC v.1600 32 bit (Intel)]'
Machine arch : 64-bit
P.S.: Kindly point me to a good article that explains Python memory management well. I spent almost an hour trying to figure these things out and ended up asking this question on SO. :(
Python bytearray() Function
The bytearray() function returns a bytearray object. It can convert objects into bytearray objects, or create an empty bytearray object of the specified size.
The difference between bytes() and bytearray() is that bytes() returns an object that cannot be modified, and bytearray() returns an object that can be modified.
list is a global name that may be overridden during runtime. list() calls that name. [] is always a list literal.
I'm not claiming this is a complete answer, but here are some hints for understanding this.
bytearray is a sequence of bytes and list is a sequence of object references. So [1,2,3] actually holds memory pointers to those integers, which are stored elsewhere in memory. To calculate the total memory consumption of a list structure, we can do this (I'm using sys.getsizeof everywhere from here on; it calls __sizeof__ and adds the GC overhead):
>>> from sys import getsizeof
>>> x = [1,2,3]
>>> sum(map(getsizeof, x)) + getsizeof(x)
172
The result may differ between machines and Python builds.
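As a rough sketch of the calculation above, the helper below adds the size of the list object to the sizes of the objects it references. The name total_list_size is hypothetical, and unlike the one-liner above it counts each distinct element only once (an assumption about what "total" should mean when the same object appears several times). The exact numbers depend on the interpreter build, so none are shown.

```python
import sys

def total_list_size(items):
    """Size of the list object plus the size of each distinct object it references."""
    # A list may reference the same object several times; count it once.
    unique = {id(obj): obj for obj in items}
    return sys.getsizeof(items) + sum(sys.getsizeof(obj) for obj in unique.values())

values = [1, 2, 3]
print("list object only:  ", sys.getsizeof(values))
print("list plus elements:", total_list_size(values))
print("bytearray:         ", sys.getsizeof(bytearray(values)))
```

The bytearray line is there for comparison: it needs no per-element objects, because the byte values live inside the bytearray itself.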
Also, look at this:
>>> getsizeof([])
64
That's because lists are mutable. To be fast, this structure allocates a range of memory to store references to objects (plus some storage for metadata, such as the length of the list). When you append items, the next memory cells are filled with references to those items. When there is no room left to store new items, a new, larger range is allocated, the existing data is copied there, and the old range is released. This is called a dynamic array.
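The allocate, copy and release cycle described above can be sketched with a toy class. This is only an illustration: DynamicArray and its methods are hypothetical names, and the doubling policy is an assumption chosen for simplicity (CPython's real list grows by a smaller proportion).

```python
class DynamicArray:
    """Toy growable array; the doubling policy is an illustrative assumption."""

    def __init__(self):
        self._capacity = 4                      # slots currently reserved
        self._length = 0                        # slots actually in use
        self._slots = [None] * self._capacity   # stands in for a raw memory block

    def append(self, item):
        if self._length == self._capacity:
            # No room left: move everything to a larger block first.
            self._grow(2 * self._capacity)
        self._slots[self._length] = item
        self._length += 1

    def _grow(self, new_capacity):
        # Allocate a larger block, copy the existing references, drop the old block.
        new_slots = [None] * new_capacity
        new_slots[:self._length] = self._slots[:self._length]
        self._slots = new_slots
        self._capacity = new_capacity

    def __len__(self):
        return self._length

    def capacity(self):
        return self._capacity

arr = DynamicArray()
for k in range(10):
    arr.append(k)
    print("length:", len(arr), "capacity:", arr.capacity())
```

Most appends only write one slot; the expensive copy happens only when the reserved block is full, which is why the reported size of a real list stays flat for several appends and then jumps.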
You can observe this behaviour, by running this code.
import sys
data = []
n = 15
for k in range(n):
    a = len(data)
    b = sys.getsizeof(data)
    print('Length: {0:3d}; Size in bytes: {1:4d}'.format(a, b))
    data.append(None)
My results:
Length: 0; Size in bytes: 64
Length: 1; Size in bytes: 96
Length: 2; Size in bytes: 96
Length: 3; Size in bytes: 96
Length: 4; Size in bytes: 96
Length: 5; Size in bytes: 128
Length: 6; Size in bytes: 128
Length: 7; Size in bytes: 128
Length: 8; Size in bytes: 128
Length: 9; Size in bytes: 192
Length: 10; Size in bytes: 192
Length: 11; Size in bytes: 192
Length: 12; Size in bytes: 192
Length: 13; Size in bytes: 192
Length: 14; Size in bytes: 192
We can see that the size grows in steps rather than with every append. For example, the jump from 128 to 192 bytes adds 64 bytes, which is room for 8 memory addresses (64 bits each).
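To read the output above in terms of slots rather than bytes, one can subtract the size of an empty list and divide by the pointer size. This is a sketch that assumes CPython, where an empty list reserves no slots; the names POINTER_SIZE, EMPTY_SIZE and resize_points are introduced here for illustration only.

```python
import struct
import sys

POINTER_SIZE = struct.calcsize("P")   # bytes per object reference on this build
EMPTY_SIZE = sys.getsizeof([])        # fixed overhead of a list with no slots (CPython assumption)

data = []
resize_points = []                    # (length, inferred capacity) at each growth step
last_size = EMPTY_SIZE
for k in range(64):
    data.append(None)
    size = sys.getsizeof(data)
    if size != last_size:
        resize_points.append((len(data), (size - EMPTY_SIZE) // POINTER_SIZE))
        last_size = size

for length, capacity in resize_points:
    print("grew at length {0:3d}; room for {1:3d} references".format(length, capacity))
```

The lengths at which the list grows, and by how much, vary between Python versions, so the printed values should be treated as specific to the interpreter that produced them.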
Almost the same goes for bytearray() (change the second line to data = bytearray() and append 1 instead of None in the last line).
Length: 0; Size in bytes: 56
Length: 1; Size in bytes: 58
Length: 2; Size in bytes: 61
Length: 3; Size in bytes: 61
Length: 4; Size in bytes: 63
Length: 5; Size in bytes: 63
Length: 6; Size in bytes: 65
Length: 7; Size in bytes: 65
Length: 8; Size in bytes: 68
Length: 9; Size in bytes: 68
Length: 10; Size in bytes: 68
Length: 11; Size in bytes: 74
Length: 12; Size in bytes: 74
Length: 13; Size in bytes: 74
Length: 14; Size in bytes: 74
The difference is that the memory is now used to hold the actual byte values, not pointers.
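A quick way to check this is to compare the per-element cost of the two containers. The sketch below assumes CPython and measures only the container's own storage, not the objects a list points to; the variable names are illustrative.

```python
import struct
import sys

n = 1000
pointer_size = struct.calcsize("P")   # size of one object reference on this build

# Storage used for n elements, with the fixed header of an empty container subtracted.
list_bytes = sys.getsizeof([None] * n) - sys.getsizeof([])
bytearray_bytes = sys.getsizeof(bytearray(n)) - sys.getsizeof(bytearray())

print("pointer size:", pointer_size)
print("list:      about", list_bytes / n, "bytes per element")
print("bytearray: about", bytearray_bytes / n, "bytes per element")
```

On a typical build the list figure should come out close to the pointer size and the bytearray figure close to 1, which matches the explanation that a list stores references while a bytearray stores the bytes themselves.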
Hope that helps you to investigate further.