
inst and extdata folders in R Packaging

Tags: package, r

In the documentation, R suggests that raw data files (neither .RData nor .rda) should be placed in inst/extdata/.

From the first paragraph in: http://cran.r-project.org/doc/manuals/R-exts.html#Data-in-packages

The data subdirectory is for data files, either to be made available via lazy-loading or for loading using data(). (The choice is made by the ‘LazyData’ field in the DESCRIPTION file: the default is not to do so.) It should not be used for other data files needed by the package, and the convention has grown up to use directory inst/extdata for such files.

So, I have moved all of my raw data into this folder, but when I build and reload the package and then try to access the data in a function with (for example):

read.csv(file=paste(path.package("my_package"),"/inst/extdata/my_raw_data.csv",sep="")) 
# .path.package is now path.package in R 3.0+

I get the "cannot open file" error.

However, it does look like there is a folder called /extdata in the package directory with the files in it (post-build and install). What's happening to the /inst folder?

Does everything in the /inst folder get pushed into the / of the package?

Brandon Bertelsen asked Nov 19 '12 22:11




2 Answers

More useful than using file.path would be to use system.file. Once your package is installed, you can grab your file like so:

fpath <- system.file("extdata", "my_raw_data.csv", package="my_package")

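fpath will now hold the absolute path to the file on disk. Note that there is no "inst" component in the call: the contents of inst/ are copied to the top level of the installed package, so the file lives under extdata/. If the file cannot be found, system.file returns an empty string.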
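As a rough sketch of how this might look inside a package function: the helper name extdata_path is hypothetical, and "my_package" / "my_raw_data.csv" are just the placeholder names from the question. The last two lines use the utils package (which ships with R) only to illustrate that lookups are relative to the installed package root, where no inst/ directory exists.

```r
# Hypothetical helper: resolve a file shipped under inst/extdata of an
# installed package. The name extdata_path is an assumption for illustration.
extdata_path <- function(file, package) {
  # mustWork = TRUE raises an error instead of returning "" when nothing matches
  system.file("extdata", file, package = package, mustWork = TRUE)
}

# Usage inside the package (assumes my_package is installed):
# raw <- read.csv(extdata_path("my_raw_data.csv", "my_package"))

# Illustration with a package that ships with R: paths are resolved against
# the installed package directory, which has no inst/ subdirectory.
found   <- system.file("DESCRIPTION", package = "utils")
missing <- system.file("inst", "DESCRIPTION", package = "utils")
```

Using mustWork = TRUE is a design choice: a missing file then fails at the lookup rather than later inside read.csv with a less obvious "cannot open file" message.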

Steve Lianoglou answered Oct 02 '22 16:10


You were both very close and essentially had this. A formal reference from 'Writing R Extensions' is:

1.1.3 Package subdirectories

[...]

The contents of the inst subdirectory will be copied recursively to the installation directory. Subdirectories of inst should not interfere with those used by R (currently, R, data, demo, exec, libs, man, help, html and Meta, and earlier versions used latex, R-ex). The copying of the inst happens after src is built so its Makefile can create files to be installed. Prior to R 2.12.2, the files were installed on POSIX platforms with the permissions in the package sources, so care should be taken to ensure these are not too restrictive: R CMD build will make suitable adjustments. To exclude files from being installed, one can specify a list of exclude patterns in file .Rinstignore in the top-level source directory. These patterns should be Perl-like regular expressions (see the help for regexp in R for the precise details), one per line, to be matched against the file and directory paths, e.g. doc/.*[.]png$ will exclude all PNG files in inst/doc based on the (lower-case) extension.
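To get a feel for what such an exclude pattern matches, here is a small approximation using grepl with Perl-style regular expressions. The file paths are made up for illustration, and this does not reproduce exactly how R CMD INSTALL applies .Rinstignore; it only checks the example pattern from the manual against a few candidate paths.

```r
# Example pattern quoted from 'Writing R Extensions'
pattern <- "doc/.*[.]png$"

# Hypothetical paths, written relative to inst/
paths <- c("doc/figure.png", "doc/manual.pdf", "extdata/my_raw_data.csv")

# TRUE marks paths the pattern would match (and hence exclude)
excluded <- grepl(pattern, paths, perl = TRUE)
```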

Dirk Eddelbuettel answered Oct 02 '22 16:10