How to compress saves in R package build

I'm trying to include a (somewhat) large dataset in an R package. During the check in RStudio I keep getting a warning saying that I could save space with better compression:

* checking data for ASCII and uncompressed saves ... WARNING

  Note: significantly better compression could be obtained
        by using R CMD build --resave-data
          old_size new_size compress
  slp.rda    499Kb    310Kb    bzip2
  sst.rda    1.3Mb    977Kb       xz

I've tried adding --resave-data to RStudio's "Configure Build Tools" to no effect.

asked Sep 16 '15 10:09 by Marc in the box



2 Answers

The devtools function use_data (since moved to the usethis package) takes a compress parameter for the type of compression and makes adding data to packages much easier in general. Whether you use it or just call save on your own, use xz compression when you save your data (for save it is also the compress parameter; compression_level only sets how hard the chosen method works).
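
As a minimal sketch of the advice above: the matrix `sst` below is made-up stand-in data, and the temporary directory is only an assumption so the example has no side effects (in a package the files would go in data/). It saves the same object with gzip and with xz so the sizes can be compared.

```r
# Hypothetical stand-in for the real dataset; `sst` is just an example name.
set.seed(1)
sst <- matrix(round(rnorm(1e4), 2), nrow = 100)

# Write into a temporary directory; in a package this would be data/.
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)

gz_file <- file.path(data_dir, "sst_gzip.rda")
xz_file <- file.path(data_dir, "sst.rda")

# `compress` selects the algorithm; `compression_level` only tunes its effort.
save(sst, file = gz_file, compress = "gzip")
save(sst, file = xz_file, compress = "xz", compression_level = 9)

# Compare the on-disk sizes in bytes (gzip first, xz second).
print(file.size(c(gz_file, xz_file)))

# With usethis installed, the equivalent inside a package project would be:
# usethis::use_data(sst, compress = "xz", overwrite = TRUE)
```

Which method wins depends on the data, as the check output in the question shows (bzip2 for one file, xz for the other), so it can be worth trying more than one.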

If you want to use --resave-data, try passing --resave-data=best explicitly. When the option is not supplied, R CMD build resaves the data with gzip, which gains you pretty much nothing in this case.

See the "Building package tarballs" section of Writing R Extensions for more information.

answered Oct 10 '22 00:10 by hrbrmstr



Another alternative, if you have a large dataset that you don't want to re-create, is to use tools::resaveRdaFiles from within R. Point it at the dataset file, or the entire data directory, and it will compress your data in a format of your choosing. See its manual page for more information.
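
A rough sketch of that approach, assuming a hypothetical data directory with a single gzip-compressed file (the data frame `slp` and the temporary path are made up for illustration). `tools::checkRdaFiles` reports the current size and compression, and `tools::resaveRdaFiles` rewrites the files in place.

```r
# Hypothetical data directory holding one gzip-compressed saved image.
data_dir <- file.path(tempdir(), "pkgdata")
dir.create(data_dir, showWarnings = FALSE)

slp <- data.frame(x = seq_len(5000), y = rep(c("a", "b"), 2500))
rda <- file.path(data_dir, "slp.rda")
save(slp, file = rda, compress = "gzip")

# Size and compression of every .rda/.RData file in the directory, before.
before <- tools::checkRdaFiles(data_dir)

# Re-save in place with an explicit method; compress = "auto" would instead
# let R compare methods and pick one itself.
tools::resaveRdaFiles(data_dir, compress = "xz")

# Same report after re-saving.
after <- tools::checkRdaFiles(data_dir)
print(before)
print(after)
```

Because the files are overwritten, it may be safer to run this on a copy of the data directory or with the package under version control.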

answered Oct 10 '22 01:10 by Martin Smith
