I want to know how Python knows (if it knows) that a value-type object is already stored in its memory (and also knows where it is).
For this code, when assigning the value 1 to b, how does it know that the value 1 is already in its memory, so that it can store a reference to it in b?
>>> a = 1
>>> b = 1
>>> a is b
True
Python stores objects in heap memory and references to those objects on the stack. A variable is a name that holds a reference; the object it refers to lives on the heap.
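A small sketch of what "a variable is only a reference" means in practice. The variable names are arbitrary; the point is that assignment binds a second name to the same object rather than copying it.

```python
# Two names bound to one object: assignment does not copy.
a = [1, 2, 3]
b = a
b.append(4)

same_object = a is b      # both names refer to the same list
a_after = list(a)         # snapshot of what `a` sees after the append

# Rebinding `b` points it at a different object; `a` is untouched.
b = [1, 2, 3, 4]
equal_but_distinct = (a == b) and (a is not b)
```

Here `is` compares identity (same object) and `==` compares value, which is the distinction the question turns on.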
Python has a built-in module named 'array' which is similar to arrays in C or C++. In this container, the data is stored in a contiguous block of memory. Just like arrays in C or C++, these arrays support only one data type at a time, so they are not heterogeneous like Python lists.
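For illustration, a minimal example of the single-type restriction, using the standard `array` module. The type code `"i"` (signed C int) is just one possible choice.

```python
from array import array

# "i" is the type code for a signed C int; every element must fit that type.
numbers = array("i", [1, 2, 3])
numbers.append(4)

try:
    numbers.append(2.5)   # a float is rejected by an integer array
    rejected = False
except TypeError:
    rejected = True

# A list, by contrast, holds references to objects of any type.
mixed = [1, "two", 3.0]
```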
In CPython, which is what most people use when they use Python, all Python objects are represented by a C struct, PyObject.
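Every PyObject carries a reference count and a pointer to its type. Both can be observed from Python; the sketch below assumes CPython, since `sys.getrefcount` reports an implementation detail and the absolute numbers are not meaningful on their own.

```python
import sys

obj = object()

# getrefcount includes the temporary reference held by its own argument,
# so only the difference between two readings is interesting.
before = sys.getrefcount(obj)
alias = obj                      # one more name bound to the same object
after = sys.getrefcount(obj)

added_references = after - before
obj_type = type(obj)             # the type every object header points to
```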
Python, however, doesn't use a fixed number of bits to store integers. Instead, it uses a variable number of bits and grows the storage as the value gets larger. The largest integer Python can represent is limited only by the available memory.
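A quick way to see the variable-width storage, as a rough sketch: `int.bit_length()` reports how many bits the value needs, and `sys.getsizeof` shows the object growing with magnitude. The exact byte counts depend on the CPython version and build, so none are stated here.

```python
import sys

big = 2 ** 1000                  # far beyond any 64-bit range, still exact
bits_needed = big.bit_length()   # 1001

# Object size in bytes for increasingly large values.
sizes = [sys.getsizeof(10 ** k) for k in (0, 20, 40, 80)]
```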
Python (CPython, to be precise) shares small integers for quick access. Integers in the range [-5, 256] already exist in memory, so if you check their addresses, they are the same. For larger integers, however, this is not true.
a = 100000
b = 100000
a is b # False when each line is entered separately in the interactive interpreter
Wait, what? If you check the address of the numbers, you'll find something interesting:
a = 1
b = 1
id(a) # 4463034512
id(b) # 4463034512
a = 257
b = 257
id(a) # 4642585200
id(b) # 4642585712
This is called the integer cache.
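The same check can be made less dependent on how the code is entered. In the sketch below the values are built at run time with `int("...")`, so the compiler has no chance to merge identical literals. The identity results in the comments describe CPython behavior and are an implementation detail, not a language guarantee.

```python
# Values are created at run time, not taken from literals in the source.
small_a, small_b = int("256"), int("256")
large_a, large_b = int("100000"), int("100000")

small_shared = small_a is small_b   # CPython: True, both come from the cache
large_shared = large_a is large_b   # CPython: False, two separate objects
values_equal = large_a == large_b   # True on every implementation
```

This is why integers should be compared with `==`; `is` only tells you whether two names happen to share one object.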
Thanks to the comments from @KlausD and @user2357112: using a small integer directly goes through the integer cache, but the result of a calculation, even if it equals a number in the range [-5, 256], is not necessarily the cached object. For example:
pow(3, 47159012670, 47159012671) is 1 # False
pow(3, 47159012670, 47159012671) == 1 # True
“The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object.”
Why? Because small integers are used very frequently, for example as loop counters. Returning a reference to an existing object instead of creating a new one saves the allocation overhead.
If you take a look at Objects/longobject.c, which implements the int type for CPython, you will see that the numbers between -5 (-NSMALLNEGINTS) and 256 (NSMALLPOSINTS - 1) are pre-allocated and cached. This is done to avoid the penalty of allocating multiple unnecessary objects for the most commonly used integers. This works because integers are immutable: you don't need multiple objects to represent the same number.