 

Reading large RDS files in R in a faster way


I have a large RDS file to read in R. However, it takes quite some time to read the file.

Is there a way to speed up the reading? I tried the data.table library with its fread function, but I get an error.

data <- readRDS("myData.rds")
data <- fread("myData.rds")  # error
Mefhisto1 asked Jun 21 '14 07:06



People also ask

How do I read RDS data in R?

R has its own data file format, usually saved with the .rds extension. To read an R data file, call the readRDS() function. As with a CSV file, you can load an RDS file straight from a website; however, because RDS files are compressed by default, you must first wrap the URL connection in a decompressor such as gzcon() before passing it to readRDS().

Are RDS files compressed?

RDS without compression: saveRDS has a compress argument that defaults to TRUE. Not compressing the file results in a bigger file size, but quicker read and write times. RDS files must be read entirely into memory, so the “Read & Filter” and “Read & Group & Summarize” times will be driven by the “Read” timing.
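The size versus speed trade-off described above can be checked with a small experiment. This is only a sketch: the data frame and the temporary file paths are made up for illustration, and the actual timings depend on the data, the disk, and the CPU.

```r
# Hypothetical example data; object and file names are illustrative only.
n  <- 1e5
df <- data.frame(
  id    = seq_len(n),
  value = rep(c(1.5, 2.5, 3.5), length.out = n)
)

compressed_path   <- tempfile(fileext = ".rds")
uncompressed_path <- tempfile(fileext = ".rds")

# compress = TRUE (gzip) is the default for saveRDS().
saveRDS(df, compressed_path)
# Skipping compression gives a larger file with no decompression step on read.
saveRDS(df, uncompressed_path, compress = FALSE)

# Compare file sizes in bytes.
file.size(compressed_path)
file.size(uncompressed_path)

# Compare read times; both reads return the same object.
system.time(a <- readRDS(compressed_path))
system.time(b <- readRDS(uncompressed_path))
identical(a, b)
```

If the file is written once and read many times from fast local storage, re-saving it with compress = FALSE is a cheap thing to try before switching formats.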

How do I export an RDS file in R?

To save data as an RData object, use the save function. To save data as an RDS object, use the saveRDS function. In each case, the first argument should be the name of the R object you wish to save. You should then include a file argument that has the file name or file path you want to save the data set to.

What are RDS files in R?

R also has two native data formats: Rdata (sometimes shortened to Rda) and Rds. These formats are used when R objects are saved for later use. Rdata is used to save multiple R objects, while Rds is used to save a single R object. See below for instructions on how to read and load data into R from both file extensions.
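A minimal sketch of the difference between the two formats, using made-up objects and temporary files: save() and load() store and restore several objects under their original names, while saveRDS() and readRDS() handle a single object that you assign to whatever name you like.

```r
# Hypothetical objects; names and paths are illustrative only.
x <- 1:10
y <- letters[1:3]

rdata_path <- tempfile(fileext = ".RData")
rds_path   <- tempfile(fileext = ".rds")

save(x, y, file = rdata_path)   # RData: several objects in one file
saveRDS(x, file = rds_path)     # RDS: exactly one object

rm(x, y)

# load() recreates x and y in the current environment and
# returns the names of the restored objects.
loaded_names <- load(rdata_path)

# readRDS() returns the object, so it can be bound to any name.
x_copy <- readRDS(rds_path)
```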


1 Answer

One way to speed up reading large files is to read them in compressed form. First, the uncompressed file:

system.time(read.table("bigdata.txt", sep=","))
   user  system elapsed
170.901   1.996 192.137

Now the same read, but from a gzip-compressed copy of the file:

system.time(read.table("bigdata-compressed.txt.gz", sep=","))
   user  system elapsed
 65.511   0.937  66.198
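To try the comparison above without the original files, here is a rough, self-contained sketch. The data and the temporary paths are invented for illustration; whether the compressed read wins depends on how slow the disk is relative to the CPU, so the timings may not follow the same pattern on every machine.

```r
# Hypothetical data standing in for bigdata.txt; sizes and names are made up.
n  <- 2e5
df <- data.frame(id = seq_len(n), value = round(runif(n), 3))

plain_path <- tempfile(fileext = ".txt")
gz_path    <- paste0(plain_path, ".gz")

# Write the same comma-separated rows twice: once plain, once through gzip.
write.table(df, plain_path, sep = ",", row.names = FALSE, col.names = FALSE)
con <- gzfile(gz_path, "w")
write.table(df, con, sep = ",", row.names = FALSE, col.names = FALSE)
close(con)

# read.table() detects the gzip compression when given the file name.
t_plain <- system.time(plain <- read.table(plain_path, sep = ","))
t_gz    <- system.time(gz    <- read.table(gz_path, sep = ","))

t_plain
t_gz
identical(plain, gz)
```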
hshihab answered Oct 11 '22 06:10
