 

Perl: string length limitations in real life

Tags: string, perl, limit

While perldata, for example, documents that scalar strings in Perl are limited only by available memory, I strongly suspect that in real life there are some other limits.

I'm considering the following ideas:

  • I'm not sure how strings are implemented in Perl — is there some sort of byte/character counter? If there is, then probably it's implemented as a platform-dependent integer (i.e. 32-bit or 64-bit), so effectively it would limit strings to something like 2 ** 31, 2 ** 32, 2 ** 63 or 2 ** 64 bytes.
  • If Perl doesn't use a counter and instead uses some byte to terminate the string (which would be strange, as it's perfectly ok to have a string like "foo\0bar" in Perl), then all operations would inevitably get much slower as string length increases.
  • Most of Perl's string functions, such as length, return an ordinary scalar integer, and I strongly suspect that it would be a platform-limited integer too.

So, what would be the other factors that limit Perl string length in real life? What should be considered an okay string length for practical purposes?

asked Apr 02 '14 17:04 by GreyCat




1 Answer

Perl keeps track of both the size of the string's buffer and the number of bytes used in it.

$ perl -MDevel::Peek -e'$x="abcdefghij"; Dump($x);'
SV = PV(0x9222b00) at 0x9222678
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x9238220 "abcdefghij"\0
  CUR = 10                        <-- 10 bytes used
  LEN = 12                        <-- 12 bytes allocated
  • On a 32-bit build of Perl, these values are stored as 32-bit unsigned integers. That is (exactly) large enough to create a string that uses up your process's entire 4 GiB address space.

  • On a 64-bit build of Perl, these values are stored as 64-bit unsigned integers. That is (exactly) large enough to create a string that uses up your process's entire 16 EiB address space.
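
To see which of the two cases applies to a given perl, the core Config module can be queried. This is only a rough sketch: it reports the pointer width and the native integer width of the build, on the assumption that the pointer width is a reasonable proxy for the width of the CUR and LEN fields shown above.

```perl
use strict;
use warnings;
use Config;

# Pointer width of this perl build, in bits (32 or 64 on common platforms).
my $ptr_bits = $Config{ptrsize} * 8;

# Width of perl's native integer type (IV/UV), in bits. A 32-bit build
# compiled with 64-bit integer support reports 64 here.
my $iv_bits = $Config{ivsize} * 8;

print "pointer width: $ptr_bits bits\n";
print "integer width: $iv_bits bits\n";
```

If the integer width is at least the pointer width, a string length that fits in the address space should also fit in an ordinary Perl integer.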

The docs are correct. The size of the string is limited only by available memory.
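
A small illustration of the points above, assuming nothing beyond core Perl: an embedded NUL byte does not terminate a string, and length reports the stored count for a larger string as well. The 16 MiB size is arbitrary and chosen only to be harmless on most machines.

```perl
use strict;
use warnings;

# An embedded NUL byte is ordinary data; it does not end the string.
my $with_nul = "foo\0bar";
print length($with_nul), "\n";    # 7

# Build a 16 MiB string with the repetition operator.
my $size = 16 * 1024 * 1024;
my $big  = "x" x $size;
print length($big), "\n";         # 16777216
```

Raising $size toward the amount of free memory is where a practical limit would be expected to show up first, well before the counter itself overflows.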

answered Sep 17 '22 22:09 by ikegami