If it is environment-independent, what is the theoretical maximum number of characters in a Python string?
From "Identifiers and keywords" in The Python Language Reference: "Identifiers are unlimited in length." But you'll most likely be violating PEP 8, which is not really cool: "Limit all lines to a maximum of 79 characters."
A character-string value is a sequence of characters. The number of characters in a sequence is called the length of the sequence. In Open PL/I, the maximum length of a string value is 32767 bytes or characters. A character-string of zero length is called a null string.
With a 64-bit Python installation, and (say) 64 GB of memory, a Python 2 string of around 63 GB should be quite feasible (if not maximally fast). If you can upgrade your memory much beyond that (which will cost you an arm and a leg, of course), your maximum feasible strings should get proportionally longer. (I don't recommend relying on virtual memory to extend that by much, or your runtimes will get simply ridiculous;-).
With a typical 32-bit Python installation, of course, the total memory you can use in your application is limited to something like 2 or 3 GB (depending on OS and configuration), so the longest strings you can use will be much smaller than in 64-bit installations with ridiculously high amounts of RAM.
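For what it's worth, here is a minimal sketch of how you might check which of the two cases you are in. It relies on `sys.maxsize`, which the standard library documents as the largest value a `Py_ssize_t` can hold and therefore the upper bound on the length of strings, lists, and other containers. The helper name `describe_limits` is made up for this example. Treat the number as a ceiling only: as described above, available memory runs out long before it on any real machine.

```python
import struct
import sys


def describe_limits():
    """Return (pointer_bits, max_len) for the running interpreter.

    describe_limits is a hypothetical helper name used for illustration.
    """
    pointer_bits = struct.calcsize("P") * 8  # size of a C pointer, in bits
    max_len = sys.maxsize                    # largest Py_ssize_t value
    return pointer_bits, max_len


bits, max_len = describe_limits()
print("%d-bit build, length ceiling: %d characters" % (bits, max_len))
```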
I ran this code on an EC2 instance.
    def create1k():
        s = ""
        for i in range(1024):
            s += '*'
        return s

    def create1m():
        s = ""
        x = create1k()
        for i in range(1024):
            s += x
        return s

    def create1g():
        s = ""
        x = create1m()
        for i in range(1024):
            s += x
        return s

    print("begin")
    s = ""
    x = create1g()
    for i in range(1024):
        s += x
        print(str(i) + "g ok")
        print(str(len(s)) + ' bytes')
And this is the output:
    [ec2-user@ip-10-0-0-168 ~]$ time python hog.py
    begin
    0g ok
    1073741824 bytes
    1g ok
    2147483648 bytes
    2g ok
    3221225472 bytes
    3g ok
    4294967296 bytes
    4g ok
    5368709120 bytes
    5g ok
    6442450944 bytes
    6g ok
    7516192768 bytes
    7g ok
    8589934592 bytes
    8g ok
    9663676416 bytes
    9g ok
    10737418240 bytes
    10g ok
    11811160064 bytes
    11g ok
    12884901888 bytes
    12g ok
    13958643712 bytes
    13g ok
    15032385536 bytes
    14g ok
    16106127360 bytes
    15g ok
    17179869184 bytes
    16g ok
    18253611008 bytes
    17g ok
    19327352832 bytes
    18g ok
    20401094656 bytes
    19g ok
    21474836480 bytes
    20g ok
    22548578304 bytes
    21g ok
    23622320128 bytes
    22g ok
    24696061952 bytes
    23g ok
    25769803776 bytes
    24g ok
    26843545600 bytes
    25g ok
    27917287424 bytes
    26g ok
    28991029248 bytes
    27g ok
    30064771072 bytes
    28g ok
    31138512896 bytes
    29g ok
    32212254720 bytes
    30g ok
    33285996544 bytes
    31g ok
    34359738368 bytes
    32g ok
    35433480192 bytes
    33g ok
    36507222016 bytes
    34g ok
    37580963840 bytes
    35g ok
    38654705664 bytes
    36g ok
    39728447488 bytes
    37g ok
    40802189312 bytes
    38g ok
    41875931136 bytes
    39g ok
    42949672960 bytes
    40g ok
    44023414784 bytes
    41g ok
    45097156608 bytes
    42g ok
    46170898432 bytes
    43g ok
    47244640256 bytes
    44g ok
    48318382080 bytes
    45g ok
    49392123904 bytes
    46g ok
    50465865728 bytes
    47g ok
    51539607552 bytes
    48g ok
    52613349376 bytes
    49g ok
    53687091200 bytes
    50g ok
    54760833024 bytes
    51g ok
    55834574848 bytes
    52g ok
    56908316672 bytes
    53g ok
    57982058496 bytes
    54g ok
    59055800320 bytes
    55g ok
    60129542144 bytes
    56g ok
    61203283968 bytes
    57g ok
    62277025792 bytes
    58g ok
    63350767616 bytes
    59g ok
    64424509440 bytes
    60g ok
    65498251264 bytes
    61g ok
    66571993088 bytes
    62g ok
    67645734912 bytes
    63g ok
    68719476736 bytes
    64g ok
    69793218560 bytes
    65g ok
    70866960384 bytes
    66g ok
    71940702208 bytes
    67g ok
    73014444032 bytes
    68g ok
    74088185856 bytes
    69g ok
    75161927680 bytes
    70g ok
    76235669504 bytes
    71g ok
    77309411328 bytes
    72g ok
    78383153152 bytes
    73g ok
    79456894976 bytes
    74g ok
    80530636800 bytes
    75g ok
    81604378624 bytes
    76g ok
    82678120448 bytes
    77g ok
    83751862272 bytes
    78g ok
    84825604096 bytes
    79g ok
    85899345920 bytes
    80g ok
    86973087744 bytes
    81g ok
    88046829568 bytes
    82g ok
    89120571392 bytes
    83g ok
    90194313216 bytes
    84g ok
    91268055040 bytes
    85g ok
    92341796864 bytes
    86g ok
    93415538688 bytes
    87g ok
    94489280512 bytes
    88g ok
    95563022336 bytes
    89g ok
    96636764160 bytes
    90g ok
    97710505984 bytes
    91g ok
    98784247808 bytes
    92g ok
    99857989632 bytes
    93g ok
    100931731456 bytes
    94g ok
    102005473280 bytes
    95g ok
    103079215104 bytes
    96g ok
    104152956928 bytes
    97g ok
    105226698752 bytes
    98g ok
    106300440576 bytes
    99g ok
    107374182400 bytes
    100g ok
    108447924224 bytes
    101g ok
    109521666048 bytes
    102g ok
    110595407872 bytes
    103g ok
    111669149696 bytes
    104g ok
    112742891520 bytes
    105g ok
    113816633344 bytes
    106g ok
    114890375168 bytes
    107g ok
    115964116992 bytes
    108g ok
    117037858816 bytes
    109g ok
    118111600640 bytes
    110g ok
    119185342464 bytes
    111g ok
    120259084288 bytes
    112g ok
    121332826112 bytes
    113g ok
    122406567936 bytes
    114g ok
    123480309760 bytes
    115g ok
    124554051584 bytes
    116g ok
    125627793408 bytes
    Traceback (most recent call last):
      File "hog.py", line 25, in <module>
        s += x
    MemoryError

    real    1m10.509s
    user    0m16.184s
    sys     0m54.320s
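The script raised a MemoryError once the string had reached 117 GiB (125627793408 bytes); the last iteration to complete printed "116g ok".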
    [ec2-user@ip-10-0-0-168 ~]$ python --version
    Python 2.7.12
    [ec2-user@ip-10-0-0-168 ~]$ free -m
                 total       used       free     shared    buffers     cached
    Mem:        122953        430     122522          0         11        113
    -/+ buffers/cache:        304     122648
    Swap:            0          0          0
Tested on EC2 r3.4xlarge instance running 64-bit Amazon Linux AMI 2016.09
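To put those numbers side by side, here is a small arithmetic sketch. The two constants are copied from the output above (the last length printed, and the "total" column of free -m, which I am assuming is reported in MiB); the variable names are just for illustration.

```python
GIB = 2 ** 30
MIB = 2 ** 20

last_ok_bytes = 125627793408  # last length printed before the MemoryError
total_ram_mib = 122953        # "total" column of free -m (assumed to be MiB)

last_ok_gib = last_ok_bytes // GIB
last_ok_mib = last_ok_bytes // MIB
fraction = float(last_ok_mib) / total_ram_mib

print("largest string: %d GiB (%d MiB)" % (last_ok_gib, last_ok_mib))
print("share of total RAM: %.1f%%" % (fraction * 100))
```

So the string grew to within a few percent of physical RAM before the next 1 GiB append failed, with no swap configured.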
The short answer: if you have over 100 GB of RAM, a single Python string can use up that much memory.
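If you want to try this on your own machine without taking it down, a scaled-down probe along these lines may help. This is only a sketch, not the script used above: the function name `grow_until` and the `limit_bytes` cap are additions for safety, and the function catches MemoryError instead of letting the traceback end the run.

```python
def grow_until(limit_bytes, chunk_bytes=1024 * 1024):
    """Append fixed-size chunks to a string until the cap or memory runs out.

    Hypothetical helper: returns the last length successfully reached.
    """
    chunk = "*" * chunk_bytes
    s = ""
    reached = 0
    try:
        # Stop before exceeding the cap so the probe stays bounded.
        while reached + chunk_bytes <= limit_bytes:
            s += chunk
            reached = len(s)
    except MemoryError:
        # Allocation failed; report how far the string got.
        pass
    return reached


# Probe up to 16 MiB in 1 MiB steps.
print(str(grow_until(16 * 1024 * 1024)) + " bytes")
```

Raise `limit_bytes` gradually; with a cap well above physical RAM and no swap, the loop should end in the `except` branch, as the EC2 run did.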