I have been reading about how read.table is not efficient for large data files. Also how R is not suited for large data sets. So I was wondering where I can find what the practical limits are and any performance charts for (1) Reading in data of various sizes (2) working with data of varying sizes.
In effect, I want to know when the performance deteriorates and when I hit a road block. Also any comparison against C++/MATLAB or other languages would be really helpful. finally if there is any special performance comparison for Rcpp and RInside, that would be great!
Under most 64-bit versions of Windows the limit for a 32-bit build of R is 4Gb: for the oldest ones it is 2Gb. The limit for a 64-bit build of R (imposed by the OS) is 8Tb.
The number is 2^31 - 1. This is the maximum number of rows for a data.
To specify a logical expression for the rows parameter, use the standard R operators. If subsetting is done by only rows or only columns, then leave the other value blank. For example, to subset the d data frame only by rows, the general form reduces to d[rows,] . Similarly, to subset only by columns, d[,cols] .
transform() function in R Language is used to modify data. It converts the first argument to the data frame. This function is used to transform/modify the data frame in a quick and easy way.
R is suited for large data sets, but you may have to change your way of working somewhat from what the introductory textbooks teach you. I did a post on Big Data for R which crunches a 30 GB data set and which you may find useful for inspiration.
The usual sources for information to get started are High-Performance Computing Task View and the R-SIG HPC mailing list at R-SIG HPC.
The main limit you have to work around is a historic limit on the length of a vector to 2^31-1 elements which wouldn't be so bad if R did not store matrices as vectors. (The limit is for compatibility with some BLAS libraries.)
We regularly analyse telco call data records and marketing databases with multi-million customers using R, so would be happy to talk more if you are interested.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With