I work for an org that has a number of internal packages that were created many years ago. These are in the form of package zip archives that were compiled on Windows on R 3.x. Therefore, they can't be installed on R 4.x, and can't be used on Macs or Linux either without being recompiled. So everyone in the entire org is stuck on R 3.6 until this is resolved. I don't have access to the original package source files. They are lost to time....
I want to take these packages, extract the code and data, and update them for modern best practices (roxygen, GitHub repos, testthat etc.). What is the best way of doing this? I have a fair amount of experience with package development. I have already tackled one. I started a new RStudio package project and went function by function, copying the function code to a new script file and reformatting the help from the help browser as roxygen docs. I've done the same for any internal hidden functions that I could find (mostly via pkg_name:::), and also for the internal datasets. That is all fairly straightforward, but very time consuming. It builds OK, but I haven't yet tested the actual functionality of the code.
I'm currently stuck because there are a couple of standardGeneric method functions for custom S4 class objects. I am completely unfamiliar with these and haven't been able to figure out how to copy them over. Viewing the source code, they are wrapped in new() with "standardGeneric" as the first argument (plus a lot more, obviously), as opposed to just being a simple function definition like all the other functions. Any help with how to recreate or copy these over would be very welcome.
But maybe I am going about this the wrong way in the first place. I haven't been able to find any helpful suggestions about how to "back engineer" R package source files from a compiled version.
Anyone any ideas?
Binary Packages
The binary format of an R package is useful because an R user can install a binary package without compiling all of the package's source code. In some cases source packages can take hours to install. Additionally, compiling package binaries requires locating and installing system prerequisites.
Many R packages are written in R. Since R is an interpreted language, source code written in R doesn't have to be translated into system-specific machine language before running. However, some R packages have significant portions written in other, compiled languages, usually C/C++ or Fortran.
When an R version is no longer supported, RStudio Package Manager will continue to serve binary packages for that R version in perpetuity, but no longer provide new binary packages after several months. At this time, binary packages are only supported for CRAN, curated CRAN, and CRAN snapshot sources.
This file is the result of running R CMD build for that R package.
Binary: A binary file specific to an operating system (OS) and architecture, containing compiled source code. Not an executable. The result of R CMD INSTALL.
For more information, see Wickham's book, R Packages.
Check whether this works in R 3.6.
The script below can automate at least part of your problem by writing all function sources into separate and appropriately named .R files. This code will also take care of hidden functions.
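# Use your package name (the package must be installed)
package_name <- "dplyr"
# Work in the package namespace so non-exported functions are found too
ns <- asNamespace(package_name)
# Extract all function names, including hidden ones
nms <- as.character(lsf.str(envir = ns, all.names = TRUE))
# Loop through the function names,
# extract head and body, and write them to R files
for (nm in nms) {
  fn <- get(nm, envir = ns)
  # Extract head
  hd_raw <- capture.output(args(fn))
  # Collapse raw output, but drop trailing NULL
  hd <- paste0(hd_raw[-length(hd_raw)], collapse = "\n")
  # Extract body, collapse
  bd <- paste0(capture.output(body(fn)), collapse = "\n")
  # Backtick-quote non-syntactic names such as `%>%` for the assignment
  lhs <- deparse(as.name(nm), backtick = TRUE)
  # Replace characters that are not safe in file names (e.g. on Windows)
  file_nm <- gsub("[^A-Za-z0-9._-]", "_", nm)
  # Write "name <- head body" to file
  write(paste0(lhs, " <- ", hd, bd), file = paste0(file_nm, ".R"))
}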
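For the S4 generics mentioned in the question, dumping the function as above only gives a body that calls standardGeneric(); the methods live separately in the generic's methods table. Below is a minimal sketch of one way to rebuild setGeneric() and setMethod() calls as text. The Account class, the deposit generic, and the helpers plain_fn_text and s4_source are hypothetical names made up for illustration; they are not part of any real package. The sketch assumes each method has the same arguments as its generic.

```r
library(methods)

# Toy S4 class, generic and method standing in for the legacy package
# (hypothetical names, for illustration only)
setClass("Account", representation(balance = "numeric"))
setGeneric("deposit", function(acct, amount) standardGeneric("deposit"))
setMethod("deposit", signature("Account", "numeric"), function(acct, amount) {
  acct@balance <- acct@balance + amount
  acct
})

# Deparse only formals and body, dropping the S4 wrapper object around them
plain_fn_text <- function(f) {
  g <- as.function(c(formals(f), list(body(f))))
  paste(deparse(g), collapse = "\n")
}

# Build setGeneric()/setMethod() calls as text for one generic
s4_source <- function(generic_name, where = topenv(parent.frame())) {
  gen <- getGeneric(generic_name, where = where)
  out <- sprintf('setGeneric("%s", %s)', generic_name, plain_fn_text(gen))
  # Methods directly defined for this generic (inherited ones are excluded)
  ms <- findMethods(gen)
  for (i in seq_along(ms)) {
    m <- ms[[i]]
    # The signature the method was defined for, e.g. "Account", "numeric"
    sig <- paste0('"', as.character(m@defined), '"', collapse = ", ")
    out <- c(out, sprintf('setMethod("%s", signature(%s), %s)',
                          generic_name, sig, plain_fn_text(m)))
  }
  out
}

src <- s4_source("deposit")
cat(src, sep = "\n\n")
```

For a real package you would presumably pass where = asNamespace("yourpkg") and take the generic names from getGenerics(where = asNamespace("yourpkg")). The class definitions still have to be recreated by hand with setClass(); getClass() on each class name shows its slots.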
To extract a function's help text in a similar way, you can use code from the following SO answer on .Rd file contents: How to access the help/documentation .rd source files in R?
A starting point could be something like:
library(tools)
package_name <- "dplyr"
db <- Rd_db(package_name)
# Extract all function names, including hidden ones, from the namespace
nms <- as.character(lsf.str(envir = asNamespace(package_name), all.names = TRUE))
# Loop through the function names,
# extract Rd contents if a help file of that name exists,
# and write them to new Rd files.
# Help files are named after the topic, so functions documented
# on a shared page under another name are skipped here.
for (nm in nms) {
  rd_name <- paste0(nm, ".Rd")
  if (rd_name %in% names(db)) {
    # Printing a single Rd object reproduces its Rd source text
    rd <- paste0(capture.output(print(db[[rd_name]])), collapse = "\n")
    # Write all to file
    write(rd, file = rd_name)
  }
}
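The internal datasets mentioned in the question can be pulled out in a similar spirit. The following is only a sketch: save_env_data is a hypothetical helper, and the toy environment stands in for asNamespace("yourpkg"). It saves every object that is neither a function nor an environment to its own .rds file and skips R's own bookkeeping objects.

```r
# Save every non-function object in an environment to its own .rds file.
# save_env_data is a hypothetical helper, not part of any package.
save_env_data <- function(env, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  nms <- ls(env, all.names = TRUE)
  # Skip R's bookkeeping objects (e.g. .__NAMESPACE__., .packageName)
  nms <- nms[!grepl("^\\.__", nms) & nms != ".packageName"]
  saved <- character(0)
  for (nm in nms) {
    obj <- get(nm, envir = env)
    if (is.function(obj) || is.environment(obj)) next
    saveRDS(obj, file.path(out_dir, paste0(nm, ".rds")))
    saved <- c(saved, nm)
  }
  saved
}

# Toy environment standing in for asNamespace("yourpkg")
toy <- new.env()
assign("lookup", data.frame(code = c("a", "b"), value = 1:2), envir = toy)
assign("helper", function(x) x + 1, envir = toy)

out <- file.path(tempdir(), "data-raw")
saved <- save_env_data(toy, out)
```

Objects found this way correspond to internal (sysdata) data. Exported datasets are usually not in the namespace; their names can be listed with data(package = "yourpkg")$results[, "Item"] and then fetched by name.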