
Memory optimization / Interning in python

Consider the following output from the Python interpreter:

>>> x=254
>>> y=254
>>> id(x)
2039624591696  # same as id(y)
>>> id(y)
2039624591696  # same as id(x)
>>> x=300
>>> y=300
>>> id(x)
2039667477936  # differs from id(y) once the value exceeds 256
>>> id(y)
2039667477968
>>> str7='g'*4096
>>> id(str7)
2039639279632  # same as id(str8)
>>> str8='g'*4096
>>> id(str8)
2039639279632  # same as id(str7)
>>> str9='g'*4097
>>> id(str9)
2039639275392  # same content as str10, but a different address
>>> str10='g'*4097
>>> id(str10)
2039639337008

Here, str9 (defined as 'g'*4097) gets a different memory address than str10, even though both have the same content, so there seems to be some limit at work. My question is how to find out these limits for a particular Python release.

Asked by anuraag on Dec 21 '25 at 19:12


1 Answer

Which integers and strings get automatically interned in Python is implementation-specific and has changed between versions.

Here are some principles and limits that seem to hold at least for my current installation (CPython 3.10.7):

All integers in the range [-5, 256] are automatically interned:

>>> x = 256
>>> y = 256
>>> x is y
True
>>> x = 257
>>> y = 257
>>> x is y
False
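To find the integer limits for a particular release, one option is to probe them empirically. The following is a minimal sketch, not an official API: the helper name small_int_cache_range and its limit parameter are made up for illustration. It builds each value at runtime with int(str(n)) so that the compiler cannot merge equal constants, and it assumes the cached range is contiguous and contains 0.

```python
import sys


def small_int_cache_range(limit=10_000):
    """Probe which integers around 0 are shared objects (hypothetical helper).

    Returns (low, high), or None if not even 0 appears to be shared.
    `limit` caps the scan so it also terminates on implementations where
    `is` behaves differently for integers.
    """
    def shared(n):
        # Build the value twice at runtime; a cached value yields the same object.
        return int(str(n)) is int(str(n))

    if not shared(0):
        return None

    high = 0
    while high < limit and shared(high + 1):
        high += 1

    low = 0
    while low > -limit and shared(low - 1):
        low -= 1

    return low, high


print(sys.version)
print(small_int_cache_range())
```

On the CPython 3.10.7 installation described above, this should report the same [-5, 256] range. Other releases or implementations may report something different, which is exactly the point of probing.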

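CPython (version >= 3.7) also automatically interns string constants that consist only of ASCII letters, digits, and underscores. For a constant produced by a compile-time expression such as "A" * 4096, this only happens when the result is <= 4096 characters long, apparently because the compiler does not fold longer results into a single constant. (In CPython versions <= 3.6, the limit was 20 characters.)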

>>> x = "foo"
>>> y = "foo"
>>> x is y
True
>>> x = "foo bar"
>>> y = "foo bar"
>>> x is y
False
>>> x = "A" * 4096
>>> y = "A" * 4096
>>> x is y
True
>>> x = "A" * 4097
>>> y = "A" * 4097
>>> x is y
False
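The string limit can be probed in a similar spirit. The sketch below rests on an assumption about CPython internals that the output above does not state directly: that the 4096 boundary comes from the compiler folding an expression like 'g' * 4096 into a single constant. It compiles such expressions and looks for the folded string in co_consts. The helper names is_folded and largest_folded_length are hypothetical, and the binary search assumes that folding stops at one fixed length.

```python
import sys


def is_folded(n):
    """Return True if the compiler folded 'g' * n into one string constant."""
    code = compile(f"'g' * {n}", "<probe>", "eval")
    # After folding, the code object carries a constant of exactly n characters.
    return any(isinstance(c, str) and len(c) == n for c in code.co_consts)


def largest_folded_length(upper=1 << 20):
    """Binary-search the largest n (2 <= n <= upper) for which folding happens.

    Returns None if even 'g' * 2 is not folded, and `upper` if no limit
    was found below it.
    """
    lo, hi = 2, upper
    if not is_folded(lo):
        return None
    if is_folded(hi):
        return hi
    # Invariant: is_folded(lo) is True and is_folded(hi) is False.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_folded(mid):
            lo = mid
        else:
            hi = mid
    return lo


print(sys.version)
print(largest_folded_length())
```

Note that this measures constant folding, not interning itself, so it only explains the boundary for strings written as constant expressions in the source.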

In some versions the rule was apparently to intern only strings that look like valid identifiers (so not, for example, strings starting with a digit), but that does not appear to be the rule in my installation:

>>> x = "5myvar"
>>> y = "5myvar"
>>> x is y
True
>>> 5myvar = 5
  File "<stdin>", line 1
    5myvar = 5
    ^
SyntaxError: invalid decimal literal

Additionally, automatic interning happens at compile time, so a string that is built at runtime is not interned even if its content matches an interned constant:

>>> x = "bar"
>>> y = "".join(["b","a","r"])
>>> x
'bar'
>>> y
'bar'
>>> x is y
False

Relying on automatic string interning is risky (it depends on the implementation, which may change). To ensure a string is interned you can use the sys.intern() function:

>>> x = "a string which would not normally be interned!"
>>> y = "a string which would not normally be interned!"
>>> x is y
False
>>> import sys
>>> x = sys.intern(x)
>>> y = sys.intern(y)
>>> x is y
True
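Since the original question is about memory, here is a small sketch of where sys.intern() can help in practice: strings created at runtime (for example by str.split()) are normally separate objects even when they are equal. The function name intern_tokens and the sample text are made up for illustration.

```python
import sys


def intern_tokens(text):
    """Split text and intern each token so equal tokens share one object."""
    return [sys.intern(token) for token in text.split()]


tokens = intern_tokens("spam eggs spam spam eggs")

# Count distinct objects rather than distinct values.
unique_objects = len({id(token) for token in tokens})
print(len(tokens), "tokens,", unique_objects, "distinct string objects")
```

The interned strings stay shared only while something still references them, so keep the returned list (or another reference) around for as long as you rely on identity.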
Answered by Anders Gorm on Dec 23 '25 at 08:12


