
How do you handle R Data internal to a package?

The R package I am developing requires several R data objects, such as pre-computed models and parameters.

Currently I have each object in the 'data' directory of the package in individual .RData files. When using the package users can use the "data" function to attach these objects to their environment.

The behaviour I would like instead would be that on loading the package the data objects are automatically attached to the internal package environment and not accessible to the user directly.

My understanding is that placing a 'sysdata.rda' file in the 'R' directory of the package containing the objects currently in 'data' will give me the desired result. However, is there a way to do this so that I can have each object in a separate file instead of grouped together?

Asked by Nixuz, Mar 01 '12 17:03


1 Answer

Put your sysdata.rda file in the data directory of your package.

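Do not use lazy loading of data: your DESCRIPTION file should either have no LazyData line at all or, if it has one, it should read LazyData: no

In any .R file in the R directory of your package, add a line like this at top level (outside any function), so that environment() refers to the package namespace: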

data(sysdata, envir=environment()) 

To test this, I created a data.frame named sysdata and saved it to a file called sysdata.rda in the data directory of a package called anRpackage.
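For reference, the step above could be sketched as follows. The column values are inferred from the session output shown further down (the original construction code was not posted), and the temporary directory is only a hypothetical stand-in for the real package source tree.

```r
# Hypothetical stand-in for the package source tree; in a real project this
# would be the data/ directory of the package itself.
pkg_data_dir <- file.path(tempdir(), "anRpackage", "data")
dir.create(pkg_data_dir, recursive = TRUE, showWarnings = FALSE)

# Assumed contents, chosen to match the printed session output.
sysdata <- data.frame(A = 1:5, B = 6:10, C = letters[1:5])

# save() stores the object under its own name, so the file yields an
# object called 'sysdata' when it is loaded again.
rda_path <- file.path(pkg_data_dir, "sysdata.rda")
save(sysdata, file = rda_path)
```

Because the object is saved by name, data(sysdata, envir = environment()) can later recreate it as sysdata inside the package namespace.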

I added the above line to an .R file, and also added this unexported function just to show that functions in the package have access to the data.

foo <- function() tail(sysdata, 2) 

Then I see the following in an R session:

> library(anRpackage)
> sysdata
Error: object 'sysdata' not found

> anRpackage:::sysdata
  A  B C
1 1  6 a
2 2  7 b
3 3  8 c
4 4  9 d
5 5 10 e

> anRpackage:::foo()
  A  B C
4 4  9 d
5 5 10 e

So users can still reach the data through anRpackage:::sysdata, and they still have the option to run data(sysdata) themselves, but, as you requested, the object is not directly visible in their workspace after loading the package.
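The question also asked about keeping each object in its own file. A minimal sketch of that idea, runnable outside a package, is below. The object names model_a and params_b, the demo directory, and the use of load() on explicit paths are all assumptions made for illustration; inside a real package the analogous top-level line would presumably be data(list = c("model_a", "params_b"), envir = environment()).

```r
# Simulate a data/ directory that holds one .rda file per object.
data_dir <- file.path(tempdir(), "separate_rda_demo")
dir.create(data_dir, showWarnings = FALSE)

# Hypothetical pre-computed objects, one per file.
model_a  <- lm(dist ~ speed, data = cars)
params_b <- list(alpha = 0.05, n_iter = 100L)
save(model_a,  file = file.path(data_dir, "model_a.rda"))
save(params_b, file = file.path(data_dir, "params_b.rda"))
rm(model_a, params_b)

# Load every file into a private environment rather than the user's
# workspace; in a package, the namespace plays the role of internal_env.
internal_env <- new.env()
rda_files <- list.files(data_dir, pattern = "\\.rda$", full.names = TRUE)
for (f in rda_files) {
  load(f, envir = internal_env)
}

# The objects are now reachable only through internal_env.
sort(ls(internal_env))
```

Each file stays separate on disk, and adding another object only means dropping one more .rda file into the directory (or adding one more name to the list passed to data()).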

Answered by GSee, Sep 30 '22 14:09