Is there a quick way to scan an R script and determine which packages are actually used? By this I mean looking at all of the functions called in the script and returning a list of packages that contain those function names. (I know that function names are not exclusive to any one package.)
Why not just look at packages called by library() or require()? Right. Well, I have a bad habit of loading packages I often use regardless of whether I actually use them in the script.
I'd like to clean up some scripts that I intend to share with others by removing unused packages.
I resolve to change my ways in 2016. Please help me get started.
Update
Some good ideas in the comments...
# create an R file that uses a few functions
fileConn<-file("test.R")
writeLines(c("df <- data.frame(v1=c(1, 1, 1), v2=c(1, 2, 3))",
"\n",
"m <- mean(df$v2)",
"\n",
"describe(df) #psych package"),
fileConn)
close(fileConn)
# getParseData approach (keep.source = TRUE retains parse data in non-interactive sessions)
pkg <- getParseData(parse("test.R", keep.source = TRUE))
pkg <- pkg[pkg$token=="SYMBOL_FUNCTION_CALL",]
pkg <- pkg[!duplicated(pkg$text),]
pkgname <- pkg$text
pkgname
# [1] "data.frame" "c" "mean" "describe"
Update 2
An ugly attempt to implement @nicola's idea:
# load all probable packages first
pkgList <- list(pkgname)
for (i in 1:length(pkgname)) {
try(print(packageName(environment(get(pkgList[[1]][i])))))
}
It does not like the c() function, but the results seem otherwise correct.
#[1] "base"
#Error in packageName(environment(get(pkgList[[1]][i]))) :
# 'env' must be an environment
#[1] "base"
#[1] "psych"
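The error on c() most likely comes from c() being a primitive: primitives have no enclosing environment, so environment(c) returns NULL and packageName() rejects it. A minimal sketch of a workaround follows. The helper name pkg_of is hypothetical (it is not from the comments), and it assumes every primitive lives in base, which holds for standard R.

```r
# c() is a primitive, so it carries no environment for packageName() to inspect
is.primitive(c)            # TRUE
is.null(environment(c))    # TRUE

# Hypothetical helper: report "base" for primitives, otherwise ask packageName()
pkg_of <- function(fname) {
  f <- get(fname, mode = "function")   # look the function up by name
  if (is.primitive(f)) {
    "base"                             # assumption: all primitives belong to base
  } else {
    utils::packageName(environment(f))
  }
}

pkg_of("c")      # "base"
pkg_of("mean")   # "base"
```

Functions that are not on the search path still make get() fail, so the packages of interest need to be loaded first, as in the loop above.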
An answer based on ideas in the question comments. The key functions are getParseData() and packageName().
# create an R file that uses a few functions
fileConn<-file("test.R")
writeLines(c("df <- data.frame(v1=c(1, 1, 1), v2=c(1, 2, 3))",
"\n",
"m <- mean(df$v2)",
"\n",
"describe(df) #psych package"),
fileConn)
close(fileConn)
# getParseData approach (keep.source = TRUE retains parse data in non-interactive sessions)
pkg <- getParseData(parse("test.R", keep.source = TRUE))
pkg <- pkg[pkg$token=="SYMBOL_FUNCTION_CALL",]
pkg <- pkg[!duplicated(pkg$text),]
pkgname <- pkg$text
pkgname
# [1] "data.frame" "c" "mean" "describe"
# load all probable packages first
pkgList <- list(pkgname)
for (i in 1:length(pkgname)) {
try(print(packageName(environment(get(pkgList[[1]][i])))))
}
#[1] "base"
#Error in packageName(environment(get(pkgList[[1]][i]))) :
# 'env' must be an environment
#[1] "base"
#[1] "psych"
I'll mark this as correct for now, but happy to consider other solutions.
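To get from "packages used" to "packages loaded but unused", the list above can be compared with the packages the script attaches through library() or require(). The following is only a rough sketch under some assumptions: declared_packages is a hypothetical helper, it only sees top-level calls (not ones wrapped in suppressPackageStartupMessages() or similar), and the used vector is filled in by hand from the packageName() output.

```r
# Hypothetical helper: list packages attached by top-level library()/require() calls
declared_packages <- function(path) {
  exprs <- parse(path)  # parse only; nothing in the script is evaluated
  is_load <- vapply(exprs, function(e) {
    is.call(e) && length(e) >= 2 &&
      as.character(e[[1]])[1] %in% c("library", "require")
  }, logical(1))
  # The first argument may be a bare name (psych) or a string ("psych")
  unique(vapply(exprs[is_load], function(e) as.character(e[[2]])[1], character(1)))
}

# Small illustrative script, written to a temporary file
script <- tempfile(fileext = ".R")
writeLines(c("library(psych)", "library(ggplot2)", "describe(mtcars)"), script)

declared <- declared_packages(script)
used <- c("base", "psych")   # assumed: packages reported by the packageName() loop
setdiff(declared, used)      # "ggplot2" is a candidate for removal
```

Because function names are not exclusive to one package, anything this flags is only a candidate; it is worth rerunning the script after removing a library() call.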