I am using RStudio to streamline Sweave and R for data analyses that I will share with other analysts. To make the coding of variables crystal clear, it would be great to have something like a help file, so they can call ?myData and get a helpful file if they need it. I like the Rd markup and think it has great potential for documenting analytic datasets, including an overall summary, a variable-by-variable breakdown, and an example of how to run some exploratory analyses.
It's easy to do this if you're specifically creating a package, but I think that it's confusing since packages are ultimately a collection of functions and they don't integrate Rnw files.
Can I use Roxygen2 to create help files for datasets that aren't a part of any package?
Creating Rd files: Use the File -> New -> R Documentation command in RStudio. This command lets you specify the name of an existing function or dataset to use as the basis for the Rd file, or alternatively creates a new, empty Rd file.
R objects are documented in files written in “R documentation” (Rd) format, a simple markup language much of which closely resembles (La)TeX, which can be processed into a variety of formats, including LaTeX, HTML and plain text.
The goal of roxygen2 is to make documenting your code as easy as possible. R provides a standard way of documenting packages: you write .Rd files in the man/ directory. These files use a custom syntax, loosely based on LaTeX. roxygen2 provides a number of advantages over writing .Rd files by hand.
Base R provides a standard way of documenting a package, where each documentation topic corresponds to an .Rd file in the man/ directory. These files use a custom syntax, loosely based on LaTeX, and are rendered to HTML, plain text, or PDF, as needed, for viewing.
Before I take a crack at this, I would like to reiterate what others are saying. R's package system is literally exactly what you are looking for. It is used successfully by many to distribute just data and no code. Combined with R's lazyloading of data, you can distribute large datasets as packages and not burden users who don't wish to load it all.
In addition, you will not be able to take advantage of R's help system unless you use packages. The original question explicitly asks about using ?myData, and your users will not be able to do that if you do not use a package. This is quite simply a limitation of R's base help function.
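For readers who decide to go the package route after all, here is a minimal sketch using base R's `utils::package.skeleton()`. The data frame contents and the package name `myDataPkg` are made up for illustration, and the generated Rd stub is only a template that still has to be filled in by hand.

```r
# Hypothetical example data; replace with your real analytic dataset.
myData <- data.frame(id = 1:3, score = c(2.5, 3.1, 4.0))

# Build the skeleton under a temporary directory so the working
# directory is left untouched. "myDataPkg" is an assumed package name.
pkg_parent <- tempfile("skeleton-")
dir.create(pkg_parent)
suppressMessages(
  utils::package.skeleton(name = "myDataPkg", list = "myData",
                          environment = environment(), path = pkg_parent)
)

# List what was generated: the saved dataset and its Rd stub.
pkg_files <- list.files(file.path(pkg_parent, "myDataPkg"), recursive = TRUE)
print(pkg_files)
```

Once the stub in man/ has been edited and the package has been built and installed (for example with R CMD build and R CMD INSTALL), loading it with library(myDataPkg) should make ?myData available to your users.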
Now, to answer the question. You will need to use some non-exported roxygen functions to make this work, but it's not too onerous. In addition, you'll need to put the R file(s) documenting your data into a folder of their own somewhere, and within that folder you will want to create an empty folder called man.
Example directory structure:
# ./
# ./man/
# ./myData.R
# ./otherData.R
myData.R:
#' My dataset
#'
#' This is data I like.
#'
#' @name myData
NULL
otherData.R:
#' My other dataset
#'
#' This is another dataset I like
#'
#' @name otherData
NULL
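The two blocks above are deliberately minimal. A fuller block, covering the variable-by-variable breakdown and the example analysis that the question asks for, might look like the sketch below. The variable names (`id`, `group`, `score`) and their codings are hypothetical placeholders; `@format`, `@examples`, `@docType` and `@name` are standard roxygen2 tags.

```r
#' My dataset
#'
#' This is data I like.
#'
#' @format A data frame with one row per participant and 3 variables
#'   (all three are hypothetical placeholders):
#' \describe{
#'   \item{id}{Participant identifier (integer).}
#'   \item{group}{Study arm, coded 0 = control, 1 = treatment.}
#'   \item{score}{Outcome score at follow-up (numeric).}
#' }
#'
#' @examples
#' # A first exploratory look at the data
#' summary(myData)
#'
#' @docType data
#' @name myData
NULL
```

The `\describe{}` list is plain Rd markup, so it is passed through to the generated file and shows up as a definition list in the rendered help page.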
Now, the code that will bring it all together (and you can of course wrap this in a function):
library(roxygen2)
# note: the ::: calls below reach into non-exported roxygen2 internals,
# which can change between roxygen2 versions
mydir <- "path/to/your/data/directory/"
myfiles <- c("myData.R", "otherData.R")
# get parsed source into roxygen-friendly format
env <- new.env(parent = globalenv())
rfiles <- file.path(mydir, myfiles)
blocks <- unlist(lapply(rfiles, roxygen2:::parse_file, env = env), recursive = FALSE)
parsed <- list(env = env, blocks = blocks)
# parse roxygen comments into Rd files and output them into the "./man" directory
roc <- roxygen2:::rd_roclet()
results <- roxygen2:::roc_process(roc, parsed, mydir)
roxygen2:::roc_output(roc, results, mydir, options = list(wrap = FALSE), check = FALSE)
You should now have properly formatted myData.Rd and otherData.Rd files in the once-empty man folder.
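Since these files are not installed as part of a package, ?myData still will not find them, but the base `tools` package can render an Rd file directly. A minimal sketch, using a small hand-written Rd file in a temporary directory as a stand-in for man/myData.Rd:

```r
# Write a tiny Rd file to a temporary location. In practice you would
# point rd_path at the generated file, e.g. file.path(mydir, "man", "myData.Rd").
rd_path <- file.path(tempdir(), "myData.Rd")
writeLines(c(
  "\\name{myData}",
  "\\alias{myData}",
  "\\docType{data}",
  "\\title{My dataset}",
  "\\description{This is data I like.}"
), rd_path)

# Parse the Rd source, then render it as plain text into a file.
rd <- tools::parse_Rd(rd_path)
txt_path <- file.path(tempdir(), "myData.txt")
tools::Rd2txt(rd, out = txt_path, options = list(underline_titles = FALSE))

# Read the rendered help text back and show it on the console.
rendered <- readLines(txt_path)
cat(rendered, sep = "\n")
```

`tools::Rd2HTML()` works the same way if you would rather hand your colleagues an HTML page.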