
TCL max size of array

I'm working on an engineering application whose interface is written in Tcl/Tk.

Everything went fine until I needed to use an extremely large array: 370,000,000 elements, each 2 to 10 characters long (growing linearly).

My question is: where is the size limit for Tcl arrays? I've been reading and investigating, and the only thing I've found is a "2GB" limit on string data, but I don't know if it's reliable because the source doesn't explain the reason.

I did an experiment:

set lista [list ]
catch {
    for {set i 0} {$i < 370000000} {incr i} {
        lappend lista $i
    }
}
puts $i

This stops with $i at roughly 50,000,000 on 32-bit Windows 7.

JGInternational asked Jul 09 '15 10:07



1 Answer

It's a bit complicated to explain. The 2GB limit comes from the low-level memory allocator, which has a size limit because it uses a signed 32-bit integer to describe how much memory to allocate. That was fine on 32-bit systems, but it's an open bug (which might be assigned to me) that it's still true on 64-bit systems; the right type in the C API is actually ssize_t (yeah, still signed; negative values are used for signalling) but fixing it completely wrecks a lot of API, so it requires a major version change to sort out.

But the maximum size of a list is something else. That is fundamentally linked to a combination of a few things. Firstly, there's the maximum size of a memory structure that can be allocated (the 2GB limit), which means that you probably can't reliably get more than 256M elements in a list on a 64-bit system. Then there's the total number of items allocated, though that's less of a problem in practice, particularly if you actually put items in the list multiple times (as they share references). Finally, there's the size of the string representation of the list: if you're generating that a lot, you're doing it wrong anyway, but that would be the real limiting factor in your example if you were creating it (as that will hit the 2GB limit sooner).
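
The arithmetic behind those two limits can be sketched in Tcl. This is a rough illustration, not a description of the interpreter's exact internals: it assumes the list stores one pointer per element (8 bytes on a 64-bit build) and that the allocation cap is exactly 2**31 bytes. The helper names maxElements and listStringLength are made up for this example.

```tcl
# Assumed allocation cap: 2 GB, as described above.
set allocLimit [expr {2**31}]

# How many element pointers fit under the cap?
# Assumption: one pointer of ptrSize bytes per list element.
proc maxElements {ptrSize} {
    global allocLimit
    return [expr {$allocLimit / $ptrSize}]
}

# Length in characters of the string form of the list "0 1 2 ... n-1":
# all the digits plus one space between consecutive elements.
proc listStringLength {n} {
    set total 0
    set lo 0        ;# first value in the current digit-count band
    set hi 10       ;# first value with one more digit
    set digits 1
    while {$lo < $n} {
        set top [expr {min($hi, $n)}]
        incr total [expr {($top - $lo) * $digits}]
        set lo $hi
        set hi [expr {$hi * 10}]
        incr digits
    }
    if {$n > 0} {
        incr total [expr {$n - 1}]
    }
    return $total
}

puts "elements under the cap (8-byte pointers): [maxElements 8]"
puts "string length for 370000000 elements:     [listStringLength 370000000]"
puts "cap in bytes:                             $allocLimit"
```

Under those assumptions, 8-byte pointers cap the list at 268435456 (256M) elements, while the string form of the 370,000,000-element list from the question would need 3588888889 characters, well past 2147483648.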

The actual point where you hit the memory limit might be lower, depending on when your system starts to deny requests to allocate memory. That's all up to the OS, which tends to base its decision on what else is going on on the system, so it's incredibly hard to give any kind of general rule there. My (64-bit, OSX) system took ages, but succeeded in running your sample code:

$ tclsh8.6
% eval {
set lista [list ]
catch {
    for {set i 0} {$i < 370000000} {incr i} {
        lappend lista $i
    }
}
puts $i
}
370000000
% llength $lista
370000000
% unset lista
% exit

The llength was the only truly quick operation (since it could pull the length out of the list metadata). The unset took ages. The exit was pretty quick, but took a few seconds.
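
To compare these operations on your own machine, the built-in time command can be used. A small sketch with a much shorter list (1,000,000 elements, an arbitrary size picked so it finishes quickly); the figures it prints depend entirely on the system and the Tcl build.

```tcl
# Build a smaller list the same way as in the question.
set lista [list]
for {set i 0} {$i < 1000000} {incr i} {
    lappend lista $i
}
set n [llength $lista]

# time reports the average microseconds per iteration.
puts "llength: [time {llength $lista} 1000]"
puts "unset:   [time {unset lista}]"
```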

Donal Fellows answered Oct 02 '22 17:10