"Back engineering" an R package from compiled binary version

Tags:

r

I work for an org that has a number of internal packages that were created many years ago. They exist only as package zip archives that were built on Windows under R 3.x, so they can't be installed on R 4.x, and they can't be used on Mac or Linux without being recompiled. Everyone in the org is therefore stuck on R 3.6 until this is resolved. I don't have access to the original package source files; they are lost to time.

I want to take these packages, extract the code and data, and update them for modern best practices (roxygen, GitHub repos, testthat etc.). What is the best way of doing this? I have a fair amount of experience with package development, and I have already tackled one package. I started a new RStudio package project and went function by function, copying each function's code to a new script file and reformatting the help from the help browser as roxygen docs. I've done the same for any internal hidden functions that I could find (mostly via pkg_name:::), and also for the internal datasets. That is all fairly straightforward, but very time-consuming. It builds OK, but I haven't yet tested the actual functionality of the code.

I'm currently stuck because there are a couple of standardGeneric method functions for custom S4 class objects. I am completely unfamiliar with these and haven't been able to figure out how to copy them over. Viewing the source code they are wrapped in new() with "standardGeneric" as the first argument (plus a lot more obviously), as opposed to just being a simple function definition for all the other functions. Any help with how to recreate or copy these over would be very welcome.

But maybe I am going about this the wrong way in the first place. I haven't been able to find any helpful suggestions about how to "back engineer" R package source files from a compiled version.

Anyone any ideas?


asked Nov 11 '21 15:11

hokeybot


1 Answer

Check whether this works in R 3.6.

The script below can automate at least part of your problem by writing the source of every function into a separate, appropriately named .R file. It is meant to cover hidden functions as well.

Extracting code

# Use your package name
package_name <- "dplyr"

# Work on the package namespace rather than "package:<name>", so that the
# package need not be attached and non-exported functions are included
ns <- asNamespace(package_name)

# Extract all function names, including hidden (non-exported) ones
nms <- Filter(function(nm) is.function(get(nm, envir = ns)),
              ls(ns, all.names = TRUE))

# Loop through the function names,
# deparse each function, and write it to an R file
for (nm in nms) {

    # Deparse the whole function (head and body) back to source text
    src <- deparse(get(nm, envir = ns))

    # Prefix the assignment so the file defines the function under its name;
    # backticks keep non-syntactic names such as %>% valid
    src[1] <- paste0("`", nm, "` <- ", src[1])

    # Replace characters that are not allowed in file names on Windows
    fl <- paste0(gsub("[^A-Za-z0-9._-]", "_", nm), ".R")

    # Write all to file
    writeLines(src, fl)
}
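For the S4 generics mentioned in the question, deparsing the object gives the new("standardGeneric", ...) form, which is not what you want in a source file. A rough sketch of one way around this is to rebuild the setGeneric() and setMethod() calls from the plain function inside each S4 object. The class Account, the generic deposit, and the helpers plain_fun, generic_source and method_sources are all made up for illustration; in the real package you would pass your own generic names, which getGenerics() on the package namespace should list.

```r
library(methods)

# Hypothetical stand-ins for what the old package defines
setClass("Account", representation(balance = "numeric"))
setGeneric("deposit", function(obj, amount, ...) standardGeneric("deposit"))
setMethod("deposit", "Account", function(obj, amount, ...) {
    obj@balance <- obj@balance + amount
    obj
})

# Rebuild a plain closure from an S4 function object (hypothetical helper);
# this drops the S4 slots so that deparse() yields ordinary function source
plain_fun <- function(f) as.function(c(formals(f), list(body(f))))

# Source text that recreates the generic
generic_source <- function(name) {
    def <- paste(deparse(plain_fun(getGeneric(name))), collapse = "\n")
    sprintf('setGeneric("%s", %s)', name, def)
}

# Source text that recreates every method defined directly for the generic
method_sources <- function(name) {
    ms <- findMethods(name)
    vapply(seq_along(ms), function(i) {
        m <- ms[[i]]
        # Classes in the signature the method was defined for
        sig <- paste0('"', as.character(m@defined), '"', collapse = ", ")
        def <- paste(deparse(plain_fun(m)), collapse = "\n")
        sprintf('setMethod("%s", signature(%s), %s)', name, sig, def)
    }, character(1))
}

cat(generic_source("deposit"), method_sources("deposit"), sep = "\n\n")
```

The class definitions themselves (setClass calls) still have to be rewritten by hand, for example from the output of getClass() or getSlots(). Methods whose arguments differ from the generic's are stored wrapped in a .local function, so their deparsed source may need tidying.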

Extracting help files

To extract a function's help text in a similar way, you can use code from the following SO answers:

for plain text: Get the documentation of an R function from the help as a string
for .Rd file contents: How to access the help/documentation .rd source files in R?

A starting point could be something like:

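library(tools)
package_name <- "dplyr"

# Rd_db() returns a named list of parsed Rd objects, one per help file
db <- Rd_db(package_name)

# Extract all function names from the namespace, including hidden ones
ns <- asNamespace(package_name)
nms <- Filter(function(nm) is.function(get(nm, envir = ns)),
              ls(ns, all.names = TRUE))

# Loop through the function names,
# extract Rd contents if a help file of the same name exists,
# and write them to new Rd files.
# Functions documented under a shared topic have no file of their own;
# loop over names(db) instead to get every help file.
for (nm in nms) {

    rd_name <- paste0(nm, ".Rd")
    if (rd_name %in% names(db)) {
        # Convert the parsed Rd object back to Rd source text
        rd <- paste(as.character(db[[rd_name]], deparse = TRUE), collapse = "")
        # Write all to file
        write(rd, file = rd_name)
    }

}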
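The datasets mentioned in the question can probably be pulled out in a similar way. The following is only a sketch: it uses the base datasets package as a stand-in for your package, and out_dir and scratch are names made up for illustration. It loads every shipped dataset into a scratch environment and saves each object as its own .rda file, then collects non-function objects from the namespace as candidates for internal data. Skipping names that start with a dot is a heuristic to avoid R's own bookkeeping objects.

```r
# Stand-in package; replace with your own package name
package_name <- "datasets"
out_dir <- file.path(tempdir(), "data")   # hypothetical output location
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Items look like "iris" or "beaver1 (beavers)"; in the second form the
# name in parentheses is the data file that data() has to be asked for
items <- data(package = package_name)$results[, "Item"]
file_nms <- unique(sub("^.* \\((.*)\\)$", "\\1", items))

# Load all datasets into a scratch environment, not the workspace
scratch <- new.env()
data(list = file_nms, package = package_name, envir = scratch)

# Save each object as its own .rda file, as in a package's data/ directory
for (nm in ls(scratch)) {
    save(list = nm, envir = scratch,
         file = file.path(out_dir, paste0(nm, ".rda")))
}

# Candidates for internal data (R/sysdata.rda): non-function objects in the
# namespace whose names do not start with a dot
ns <- asNamespace(package_name)
internal <- Filter(
    function(nm) !startsWith(nm, ".") && !is.function(get(nm, envir = ns)),
    ls(ns, all.names = TRUE)
)
if (length(internal) > 0) {
    save(list = internal, envir = ns, file = file.path(out_dir, "sysdata.rda"))
}
```

In a real package source tree the per-dataset files would go into data/ and sysdata.rda into R/. Check the internal list by hand before trusting it.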


answered Nov 04 '22 02:11

Roman
