Max Length for a Vector in R

Tags:

r

According to the R 'Memory-limits' documentation, it isn't possible to allocate a vector of length longer than 2^31-1. This is because the integer used as an index can only use 31 bits (one bit for the sign). But on a 64-bit system, I should be able to allocate longer vectors. Why does R impose this same max length on 64-bit systems? Is there a way to circumvent the limit?
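For reference, the 2^31-1 figure mentioned above can be checked from within R itself. This is only a small sketch using base R's `.Machine$integer.max`; the variable names are mine and purely illustrative.

```r
# Largest value an R integer can hold (31 value bits plus one sign bit)
max_int <- .Machine$integer.max
print(max_int)                    # 2147483647
print(max_int == 2^31 - 1)        # TRUE

# Going one past the limit in integer arithmetic yields NA (with a warning)
overflowed <- suppressWarnings(max_int + 1L)
print(is.na(overflowed))          # TRUE

# A double can still hold the next value exactly
next_value <- max_int + 1         # 1 is a double, so the result is a double
print(is.double(next_value))      # TRUE
print(next_value == 2^31)         # TRUE
```

So any index or length stored as an R integer cannot go past 2147483647, regardless of how much memory the machine can address.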

user1401630 asked May 17 '12 17:05


2 Answers

If you're willing to work with the development version of R, you can get experimental support for this feature. From the R-devel NEWS file (http://stat.ethz.ch/R-manual/R-devel/doc/html/NEWS.html):

LONG VECTORS

There are the beginnings of support for vectors longer than 2^31 - 1 elements on 64-bit platforms. This applies to raw, logical, integer, double, complex and character vectors, as well as lists. (Elements of character vectors remain limited to 2^31 - 1 bytes.)

All aspects are currently experimental.

What can be done with such vectors is currently somewhat limited, and most operations will return the error ‘long vectors not supported yet’. They can be serialized and unserialized, coercion, identical() and object.size() work and means can be computed. Their lengths can be get and set by xlength(): calling length() on a long vector will throw an error.

Most aspects of indexing are available. Generally double-valued indices can be used to access elements beyond 2^31 - 1.

See the link for more details. I haven't experimented with this at all myself, so I can't comment on whether it is practically useful yet or not.
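To illustrate the point about double-valued indices, here is a rough sketch. It assumes an R build with long vector support and more than 2 GiB of free memory, so the allocation is wrapped in a function (`try_long_vector`, a name made up for this example) that is defined but not called. The exact behaviour of functions such as length() may differ between the development version quoted above and later releases.

```r
# An index past the integer range has to be a double
big_index <- 2^31 + 5
print(is.double(big_index))                       # TRUE

# Converting it to an integer fails: the result is NA (with a warning)
as_int <- suppressWarnings(as.integer(big_index))
print(is.na(as_int))                              # TRUE

# Hypothetical demo, NOT run here: needs > 2 GiB of RAM and long vector support
try_long_vector <- function() {
  x <- raw(2^31 + 10)          # one byte per element, so roughly 2 GiB
  x[big_index] <- as.raw(1L)   # index with a double beyond 2^31 - 1
  x[big_index] == as.raw(1L)
}
```

A raw vector is used because it has the smallest element size, which keeps the memory needed for a long vector as low as possible.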

If you go to http://developer.r-project.org/R_svnlog_2011 (and http://developer.r-project.org/R_svnlog_2012) and search for "long vectors" you can get a sense of the work that is going on.

Ben Bolker answered Sep 23 '22 21:09


Here are some more details to complement Ben's answer. The limitations seem to be inherited from the lower-level programming languages used to build R, especially (apparently) the FORTRAN code, which uses 32-bit integers. So transitioning R to take full advantage of 64-bit addressing is going to be a major project.

From the R-admin manual:

Even on 64-bit builds of R there are limits on the size of R objects (see help("Memory-limits")), some of which stem from the use of 32-bit integers (especially in FORTRAN code). On all builds of R, the maximum length (number of elements) of a vector is 2^31-1, about 2 billion, and on 64-bit builds the size of a block of memory allocated is limited to 2^34-1 bytes (8GB). It is anticipated these will be raised eventually* but the need for 8GB objects is (when this was written in 2011) exceptional.

(There's also a wry footnote in the manual, where I've put a *, noting that "this comment has been in the manual since 2005". :)
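To put the quoted limits in perspective, here is a back-of-the-envelope calculation of how much memory a vector of the maximum length would need. It uses the standard per-element sizes for R's atomic types and ignores the small per-object header, so treat the numbers as approximate.

```r
# Maximum vector length from the quoted passage
max_len <- 2^31 - 1

# Bytes per element for common atomic vector types
bytes_per_elem <- c(raw = 1, logical = 4, integer = 4, double = 8, complex = 16)

# Approximate size in GiB of a maximal-length vector of each type
size_gib <- max_len * bytes_per_elem / 2^30
print(round(size_gib, 2))

# The quoted block-size limit of 2^34 bytes, expressed in GiB
block_limit_gib <- 2^34 / 2^30
print(block_limit_gib)            # 16
```

By this arithmetic, 2^34 bytes is 16 GiB, which corresponds to a maximal-length double vector, while 8 GiB corresponds to a maximal-length integer vector. I am not sure which of the two the "8GB" in the manual was meant to describe.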

Josh O'Brien answered Sep 19 '22 21:09