I'm refactoring a package that imports many other packages' full namespaces. I believe that many of these dependencies are used for only a single function call, which would be better handled with importFrom, or are orphaned dependencies that are no longer used at all.
There's enough code in the package that it would be tedious to manually examine every line looking for unfamiliar function calls.
How can I determine where and how many times objects from imported namespaces are being used in the package? Please note that this package does not include unit tests.
Here is a reproducible example:
DESCRIPTION file:
Package: my_package
Title: title
Version: 0.0.1
Authors@R: person(
    given = "A",
    family = "Person",
    role = c("aut", "cre"),
    email = "[email protected]"
    )
Description: Something
License: Some license
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.1.1
Imports:
    dplyr,
    purrr,
    stringr
NAMESPACE file:
import(dplyr)
import(purrr)
import(stringr)
my_package.R file:
#' my_package
#' @docType package
#' @name my_package
NULL
#' @import dplyr
#' @import purrr
#' @import stringr
NULL
functions.R file:
#' add 1 to "banana" column and call it "apple"
#' @description demonstrate a variety of dplyr functions
#' @param x a data.frame object
#' @return a data.frame object with columns "apple" and "banana"
#' @examples
#' my_fruit <- data.frame(banana = c(1,2,3), pear = c(4,5,6))
#' my_function(my_fruit)
#' @export
my_function <- function(x) {
  x %>%
    mutate(apple = banana + 1) %>%
    select(apple, banana)
}
I am looking for a solution that would identify that %>%, mutate and select are exports from dplyr, that %>% is an export from purrr, and that no exports from the attached namespace stringr are used. For functions like %>% that are exported from multiple namespaces, it's not that important to me to distinguish which namespace the export comes from (in the example both %>% are re-exports from the magrittr dependency), since where actual masking occurs a warning is generated when the package is loaded.
Here's a base R solution:
pkgs <- readLines("NAMESPACE")
pattern <- "^import\\((.*?)\\)$"
pkgs <- pkgs[grepl(pattern, pkgs)]
pkgs <- sub(pattern, "\\1", pkgs)
pkgs
#> [1] "dplyr" "purrr" "stringr"
exports <- sapply(pkgs, getNamespaceExports)
exports <- do.call(rbind, Map(data.frame, package = pkgs, fun = exports))
rownames(exports) <- NULL
head(exports)
#> package fun
#> 1 dplyr rows_upsert
#> 2 dplyr src_local
#> 3 dplyr db_analyze
#> 4 dplyr n_groups
#> 5 dplyr distinct
#> 6 dplyr summarise_
code <- sapply(list.files("R", full.names = TRUE), parse)
funs <- sapply(code, function(x) setdiff(all.names(x), all.vars(x)))
funs <- funs[lengths(funs) > 0]
funs <- do.call(rbind, Map(data.frame, fun = funs, file = names(funs)))
rownames(funs) <- NULL
funs
#> fun file
#> 1 <- R/functions.R
#> 2 function R/functions.R
#> 3 { R/functions.R
#> 4 %>% R/functions.R
#> 5 mutate R/functions.R
#> 6 + R/functions.R
#> 7 select R/functions.R
Final output:
merge(exports, funs)
#> fun package file
#> 1 %>% stringr R/functions.R
#> 2 %>% purrr R/functions.R
#> 3 %>% dplyr R/functions.R
#> 4 mutate dplyr R/functions.R
#> 5 select dplyr R/functions.R
It is not 100% robust: for instance, in a function such as function(x) {select <- identity; select(x)}, select will be reported as coming from {dplyr}.
It will also miss functions that are not used in fun() form, as in lapply(my_list, fun).
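To illustrate the second limitation, here is a minimal sketch that compares call-position names with every symbol in the code. The source snippet and the choice of utils as the imported namespace are stand-ins for illustration, not part of the package above. Matching all symbols catches functions passed as arguments, at the cost of false positives whenever a local variable shares its name with an export.

```r
# Hypothetical one-line source file: head is passed by name, never called as head()
src  <- "f <- function(my_list) lapply(my_list, head)"
expr <- parse(text = src, keep.source = FALSE)

# Names that appear in call position only (the approach used above)
called <- setdiff(all.names(expr), all.vars(expr))

# Every symbol in the code, including functions passed as arguments
anywhere <- unique(all.names(expr))

# Assumption: utils plays the role of the imported namespace here
exports <- getNamespaceExports("utils")

"head" %in% called                       # FALSE: missed by the call-position check
"head" %in% intersect(anywhere, exports) # TRUE: found when all symbols are matched
```

Any hit that only appears in the second check is a candidate to review by hand rather than a confirmed use.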
We can't really detect those robustly. A workaround that might get us there, or at least closer if we have 100% test coverage, is to wrap those imported functions so that they tell us when they're called, and then run the tests.
You probably don't need this though.
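As a rough sketch of that wrapping idea, a function can be wrapped in a closure that increments a counter before delegating to the original. The names call_log and count_calls are hypothetical, and the step of swapping the wrappers in for the package's imports before running the tests is not shown.

```r
# Environment that stores one counter per wrapped function (hypothetical name)
call_log <- new.env()

# Return a wrapper around `fun` that records each call under `label`
count_calls <- function(fun, label) {
  force(fun)
  force(label)
  assign(label, 0L, envir = call_log)
  function(...) {
    assign(label, get(label, envir = call_log) + 1L, envir = call_log)
    fun(...)
  }
}

# Demonstration with a base function, passed by name so static parsing would miss it
toupper_counted <- count_calls(toupper, "base::toupper")
invisible(lapply(c("a", "b", "c"), toupper_counted))

get("base::toupper", envir = call_log)  # 3
```

The counts only reflect code paths that actually run, which is why this approach depends on test coverage.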
You could use getParseData to get all function calls used in the package, and join them with the functions available through the imports in NAMESPACE to find out their origin.
Tested on the reproducible example my_package:
library(dplyr)
library(purrr)
library(stringr)
# List functions used in Package
path <- "./my_package"
files <- file.path(path,list.files(path= path, recursive = TRUE, pattern ='\\.R$'))
functions <- files %>% map_dfr(~{
  getParseData(parse(.x, keep.source = TRUE)) %>%
    filter(token %in% c("SYMBOL_FUNCTION_CALL", "SPECIAL")) %>%
    mutate(file = .x) %>%
    rename(fctname = text) %>%
    select(file, fctname) %>%
    unique()
})
# List of all possible functions imports
imports <- readLines(file.path(path,"NAMESPACE"))
imports <- str_match(imports, "import\\(\\s*(.*?)\\s*\\)")[,2]
imports <- imports[!is.na(imports)]
possible.imported.functions <- imports %>% map_dfr(~{
  data.frame(package.import = .x, fctname = getNamespaceExports(.x))
})
# Imported functions in use
inner_join(functions, possible.imported.functions, by = c('fctname' = 'fctname')) %>%
  arrange(package.import, fctname) %>%
  select(file, package.import, fctname)
#> file package.import fctname
#> 1 my_package/R/functions.R dplyr %>%
#> 2 my_package/R/functions.R dplyr mutate
#> 3 my_package/R/functions.R dplyr select
#> 4 my_package/R/functions.R purrr %>%
#> 5 my_package/R/functions.R stringr %>%
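The question also asks how many times each function is used, which the result above does not show because duplicate calls are dropped. A minimal base R sketch of counting, assuming the example function is supplied as text rather than read from the package directory:

```r
# The example function from the question, supplied as text for illustration
src <- '
my_function <- function(x) {
  x %>%
    mutate(apple = banana + 1) %>%
    select(apple, banana)
}'

# Parse with source references kept so that token data is available
pd <- utils::getParseData(parse(text = src, keep.source = TRUE))

# Keep ordinary function calls and %op% style operators, duplicates included
calls <- pd$text[pd$token %in% c("SYMBOL_FUNCTION_CALL", "SPECIAL")]

# One row per function name with its number of occurrences
counts <- as.data.frame(table(fctname = calls), stringsAsFactors = FALSE)

counts$Freq[counts$fctname == "%>%"]     # 2
counts$Freq[counts$fctname == "mutate"]  # 1
```

The resulting counts could then be joined to the table of namespace exports in the same way as the de-duplicated list.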