The Writing R Extensions manual states:
The data subdirectory is for data files, either to be made available via lazy-loading or for loading using data(). (The choice is made by the ‘LazyData’ field in the DESCRIPTION file: the default is not to do so.) It should not be used for other data files needed by the package, and the convention has grown up to use directory inst/extdata for such files.
But it is still not clear what data is "required" by a package. I would like to use data for the following (not always mutually exclusive) reasons:
But it is not clear which of these should go in the data folder, and which should go in inst/extdata. And are there any conditions under which "data" should go elsewhere?
Related questions: Previous questions (e.g. "inst and extdata folders in R Packaging" and "Using inst/extdata with vignette during package checking R 2.14.0") give some instructions on use, but don't tell me how to decide which directory to use. Another question, "R - where should I place RDA file - /R, /data, /inst/extdata?", gets the closest, but seems to focus specifically on RDA and RData files.
The data directory supplies data for the data() function and is expected to follow certain customs in terms of file formats and extensions.
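To illustrate what "supplies data for the data() function" means in practice, here is a minimal sketch using the datasets package that ships with R (any package with a data directory would behave similarly). The environment name env is just an illustrative choice.

```r
# List the data sets a package exposes through data().
available <- data(package = "datasets")$results[, "Item"]
head(available)

# Load one data set by name into a dedicated environment
# instead of the global workspace.
env <- new.env()
data("mtcars", package = "datasets", envir = env)
str(env$mtcars)
```

Users never need to know the file name or format on disk here; data() resolves that for them, which is why the directory has conventions about formats and extensions.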
The inst/extdata directory becomes extdata/ when installed and is more of a wild west: you can do whatever you want, and you are expected to write your own accessors.
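As a rough sketch of what "write your own accessors" could look like, a package might wrap system.file() in a small helper. The function name extdata_path, the package name "mypkg", and the file name in the usage comment are all hypothetical placeholders, not part of any real package.

```r
# Hypothetical accessor a package author might export.
# "mypkg" is a placeholder package name.
extdata_path <- function(file, package = "mypkg") {
  # system.file() returns "" when nothing is found, so fail loudly instead.
  path <- system.file("extdata", file, package = package)
  if (!nzchar(path)) {
    stop("no file '", file, "' in extdata/ of package '", package, "'")
  }
  path
}

# Illustrative usage (file name is made up):
# raw <- read.csv(extdata_path("measurements.csv"))
```

The reading step is left to the caller (or to a second helper), since files under extdata/ can be in any format.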
It may be useful to look at empirics. On my machine, among some 240 installed packages, a full 77 (or not quite a third) have data/, but only 4 (including one of mine) have extdata/.
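A count like this can be reproduced roughly as follows. This is only a sketch of one way to do it, not necessarily how the numbers above were obtained, and the results will differ from machine to machine.

```r
# Installed location of every package in all library paths.
pkgs <- installed.packages()
pkg_dirs <- file.path(pkgs[, "LibPath"], pkgs[, "Package"])

# Check each installed package for a data/ or extdata/ directory.
has_data    <- dir.exists(file.path(pkg_dirs, "data"))
has_extdata <- dir.exists(file.path(pkg_dirs, "extdata"))

c(installed = length(pkg_dirs),
  data      = sum(has_data),
  extdata   = sum(has_extdata))
```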