
Offline installation of a list of packages: getting dependencies in order

I've got the source files for a bunch of packages and their dependencies that I want to install on computers that have no internet access. I want to install all of these on other computers using a USB stick, but the install fails for some packages because their dependencies are not installed before them. How can I get the dependencies installed in order, before the packages that need them?

Here's my current method to obtain the packages, their dependencies, and get them in the correct order:

# find the dependencies for the packages I want
# from http://stackoverflow.com/a/15650828/1036500
getPackages <- function(packs){
  packages <- unlist(
    tools::package_dependencies(packs, available.packages(),
                                which=c("Depends", "Imports"), recursive=TRUE)
  )
  packages <- union(packages, packs)
  packages
}

# packages I want 
my_packages <- c('stringr', 'devtools', 'ggplot2', 'dplyr', 'tidyr', 'rmarkdown', 'knitr', 'reshape2', 'gdata')

# get names of dependencies and try to get them in the right order, this seems ridiculous... 
my_packages_and_dependencies <- getPackages(my_packages)
dependencies_only <- setdiff(my_packages_and_dependencies, my_packages)
deps_of_deps <- getPackages(dependencies_only)
deps_of_deps_of_deps <- getPackages(deps_of_deps)
my_packages_and_dependencies <- unique(c(deps_of_deps_of_deps, deps_of_deps, dependencies_only, my_packages))

# where to keep the source?
local_CRAN <- paste0(getwd(), "/local_CRAN")
dir.create(local_CRAN, showWarnings = FALSE)  # download.packages() needs an existing directory

# get them from CRAN, source files
download.packages(pkgs = my_packages_and_dependencies, destdir = local_CRAN, type = "source")
# note that 'tools', 'methods', 'utils', 'stats', etc. are not on CRAN, but are part of base R

# from http://stackoverflow.com/a/10841614/1036500
library(tools)
write_PACKAGES(local_CRAN)

Now assume I'm on another computer with a fresh install of R and RStudio (and Rtools or Xcode) and no internet connection. I plug in the USB stick, open the RProj file to set the working directory, and run this script:

#############################################################

## Install from source (Windows/OSX/Linux)

# What do I want to install?
my_packages_and_dependencies <- c("methods", "tools", "bitops", "stats", "colorspace", "graphics", 
                                  "tcltk", "Rcpp", "digest", "jsonlite", "mime", "RCurl", "R6", 
                                  "stringr", "brew", "grid", "RColorBrewer", "dichromat", "munsell", 
                                  "plyr", "labeling", "grDevices", "utils", "httr", "memoise", 
                                  "whisker", "evaluate", "rstudioapi", "roxygen2", "gtable", "scales", 
                                  "proto", "MASS", "assertthat", "magrittr", "lazyeval", "DBI", 
                                  "stringi", "yaml", "htmltools", "caTools", "formatR", "highr", 
                                  "markdown", "gtools", "devtools", "ggplot2", "dplyr", "tidyr", 
                                  "rmarkdown", "knitr", "reshape2", "gdata")

# where are the source files? 
local_CRAN <- paste0(getwd(), "/local_CRAN")

# scan the directory and get the file names of the wanted source packages
# I've got other things in this dir also
wanted_package_source_filenames <- list.files(local_CRAN, pattern = "\\.tar\\.gz$", full.names = TRUE)

# put them in order to make sure deps go first, room for improvement here...
trims <- c(local_CRAN, "/",  "tar.gz")
x1 <- gsub(paste(trims, collapse = "|"), "", wanted_package_source_filenames)
x2 <- sapply( strsplit(x1, "_"), "[[", 1)
idx <- match(my_packages_and_dependencies, x2)
wanted_package_source_filenames <- na.omit(wanted_package_source_filenames[idx])

install.packages(wanted_package_source_filenames, 
                 repos = NULL, 
                 dependencies = TRUE, 
                 contrib.url = local_CRAN, # I thought this would take care of getting dependencies automatically...
                 type  = "source" )

This works reasonably well, but still some packages fail to install:

sapply(my_packages_and_dependencies, require, character.only = TRUE) 

 methods        tools       bitops        stats 
        TRUE         TRUE         TRUE         TRUE 
  colorspace     graphics        tcltk         Rcpp 
        TRUE         TRUE         TRUE         TRUE 
      digest     jsonlite         mime        RCurl 
        TRUE         TRUE         TRUE        FALSE 
          R6      stringr         brew         grid 
        TRUE         TRUE         TRUE         TRUE 
RColorBrewer    dichromat      munsell         plyr 
        TRUE         TRUE         TRUE         TRUE 
    labeling    grDevices        utils         httr 
        TRUE         TRUE         TRUE        FALSE 
     memoise      whisker     evaluate   rstudioapi 
        TRUE         TRUE         TRUE         TRUE 
    roxygen2       gtable       scales        proto 
        TRUE         TRUE         TRUE         TRUE 
        MASS   assertthat     magrittr     lazyeval 
        TRUE         TRUE         TRUE         TRUE 
         DBI      stringi         yaml    htmltools 
        TRUE         TRUE         TRUE         TRUE 
     caTools      formatR        highr     markdown 
        TRUE         TRUE         TRUE         TRUE 
      gtools     devtools      ggplot2        dplyr 
        TRUE        FALSE        FALSE         TRUE 
       tidyr    rmarkdown        knitr     reshape2 
       FALSE        FALSE         TRUE         TRUE 
       gdata 
        TRUE 
Warning messages:
1: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘RCurl’
2: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘httr’
3: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘devtools’
4: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘ggplot2’
5: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘tidyr’
6: In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE,  :
  there is no package called ‘rmarkdown’
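
As an aside, `require()` attaches every package just to test whether it is there. A lighter check, sketched here in base R, compares the wanted names against `installed.packages()`. The helper name `missing_packages` is made up for this illustration and is not part of the original script.

```r
# Hypothetical helper (not from the original post): report which of the
# wanted packages are absent from the library, without attaching any of them.
missing_packages <- function(wanted, lib.loc = NULL) {
  installed <- rownames(installed.packages(lib.loc = lib.loc))
  setdiff(wanted, installed)
}

# Base packages such as "stats" and "utils" ship with R, so only names
# that are genuinely not installed are returned.
missing_packages(c("stats", "utils"))
```

Calling it with the full `my_packages_and_dependencies` vector should list the same failures as the warnings above, assuming the default library is the one that was installed into.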

It seems that knitr must come before rmarkdown, reshape2 before tidyr and ggplot2, and so on.

There must be a simpler and more complete way to get the list of source files into the specific order needed so that every dependency is installed before the packages that use it. What's the simplest way to do that (without using any contributed packages)?
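
For illustration, the ordering being asked for is a topological sort of the dependency graph. The base-R sketch below is one way to do it, not a tested replacement for the script above. The function name `install_order` and the small `deps` list are assumptions made for the example; a real list of direct dependencies has the same shape as the result of `tools::package_dependencies()` with `recursive = FALSE`, after dropping the base packages.

```r
# Hypothetical helper: order packages so that each one comes after
# everything it depends on (depth-first topological sort).
# `deps` is a named list: names are packages, values are character
# vectors of their direct dependencies.
install_order <- function(deps) {
  ordered  <- character(0)   # packages already placed in the result
  visiting <- character(0)   # packages on the current recursion path
  visit <- function(pkg) {
    if (pkg %in% ordered) return(invisible(NULL))
    if (pkg %in% visiting) stop("circular dependency involving ", pkg)
    visiting <<- c(visiting, pkg)
    for (d in deps[[pkg]]) visit(d)      # dependencies go first
    visiting <<- setdiff(visiting, pkg)
    ordered  <<- c(ordered, pkg)
    invisible(NULL)
  }
  for (pkg in names(deps)) visit(pkg)
  ordered
}

# Toy input built only from the orderings mentioned in the question.
deps <- list(
  rmarkdown = "knitr",
  tidyr     = "reshape2",
  ggplot2   = "reshape2",
  knitr     = character(0),
  reshape2  = character(0)
)
install_order(deps)
```

The resulting vector can then be matched against the tarball names, as the script above already does with `match()`.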

This is the system I am currently working on. I'm using the source versions of the packages in an attempt to prepare for anything on the offline computers (OSX/Linux/Windows):

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
 [1] tcltk     grid      tools     stats     graphics 
 [6] grDevices utils     datasets  methods   base     

other attached packages:
 [1] gdata_2.13.3       reshape2_1.4.1    
 [3] knitr_1.9          dplyr_0.4.1       
 [5] gtools_3.4.1       markdown_0.7.4    
 [7] highr_0.4          formatR_1.0       
 [9] caTools_1.17.1     htmltools_0.2.6   
[11] yaml_2.1.13        stringi_0.4-1     
[13] DBI_0.3.1          lazyeval_0.1.10   
[15] magrittr_1.5       assertthat_0.1    
[17] proto_0.3-10       scales_0.2.4      
[19] gtable_0.1.2       roxygen2_4.1.0    
[21] rstudioapi_0.2     evaluate_0.5.5    
[23] whisker_0.3-2      memoise_0.2.1     
[25] labeling_0.3       plyr_1.8.1        
[27] munsell_0.4.2      dichromat_2.0-0   
[29] RColorBrewer_1.1-2 brew_1.0-6        
[31] stringr_0.6.2      R6_2.0.1          
[33] mime_0.2           jsonlite_0.9.14   
[35] digest_0.6.8       Rcpp_0.11.4       
[37] colorspace_1.2-5   bitops_1.0-6      
[39] MASS_7.3-35       

loaded via a namespace (and not attached):
[1] parallel_3.1.2

EDIT: Following Andrie's helpful comment, I've had a go with miniCRAN. The bit that's missing from the vignette is how to actually install the packages from the local repo. This is what I've tried:

library("miniCRAN")

# Specify list of packages to download
pkgs <- c('stringr', 'devtools', 'ggplot2', 'dplyr', 'tidyr', 'rmarkdown', 'knitr', 'reshape2', 'gdata')

# Make list of package URLs
revolution <- c(CRAN="http://cran.revolutionanalytics.com")
pkgList <- pkgDep(pkgs, repos=revolution, type="source" )
pkgList

# Set location to store source files 
local_CRAN <- paste0(getwd(), "/local_CRAN")

# Make repo for source
makeRepo(pkgList, path = local_CRAN, repos = revolution, type = "source")

# install...
install.packages(pkgs, 
                 repos = local_CRAN, # do I really need "file:///"?
                 dependencies = TRUE, 
                 contrib.url = local_CRAN,
                 type  = "source" )

And the result is:

Installing packages into ‘C:/emacs/R/win-library/3.1’
(as ‘lib’ is unspecified)
Warning in install.packages :
  unable to access index for repository C:/Users/.../local_CRAN/src/contrib
Warning in install.packages :
  packages ‘stringr’, ‘devtools’, ‘ggplot2’, ‘dplyr’, ‘tidyr’, ‘rmarkdown’, ‘knitr’, ‘reshape2’, ‘gdata’ are not available (for R version 3.1.2)

What am I missing here?

EDIT: Yes, I was missing the proper use of file:///, which should look like this:

install.packages(pkgs, 
                 repos = paste0("file:///", local_CRAN),
                 type = "source")

That's moved me along heaps, and it all basically works as expected now. Thanks very much. Now I just have this to look into: "fatal error: curl/curl.h: No such file or directory", which is stopping RCurl and httr from installing.

Asked by Ben on Mar 05 '15 at 18:03




1 Answer

The package miniCRAN can help with this. You tell miniCRAN the list of packages you would ever want to install, it then figures out the dependencies, downloads those packages and creates a repository on your local machine that behaves like CRAN, i.e. it respects install.packages() etc.
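
For readers who want to see what such a repository looks like on disk: when `type = "source"`, `install.packages()` looks for the tarballs and a `PACKAGES` index under `src/contrib` inside the repository root. The sketch below builds that layout with base R only. The function names and the mirror URL are assumptions made for this example, and it is not a description of how miniCRAN itself is implemented.

```r
# Hypothetical helpers (names are assumptions, not part of miniCRAN or base R).

# Create <repo_root>/src/contrib, the directory that install.packages()
# searches when type = "source", and return its path.
source_contrib_dir <- function(repo_root) {
  contrib <- file.path(repo_root, "src", "contrib")
  dir.create(contrib, recursive = TRUE, showWarnings = FALSE)
  contrib
}

# Download source tarballs into the repository and (re)build its index.
# This needs internet access, so run it on the online machine.
# The default mirror is an assumption; any CRAN mirror should do.
build_source_repo <- function(pkgs, repo_root,
                              cran = "https://cloud.r-project.org") {
  contrib <- source_contrib_dir(repo_root)
  download.packages(pkgs, destdir = contrib, repos = cran, type = "source")
  tools::write_PACKAGES(contrib, type = "source")
  invisible(contrib)
}
```

Note that `pkgs` has to include the dependencies as well, since `download.packages()` fetches only the names it is given.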

More information:

  • Available on CRAN

  • Read the vignette

  • We are actively developing miniCRAN. Track progress and find the latest development version at the miniCRAN repository on GitHub

  • See the project wiki for links to presentations, blog posts and more

To install from the local miniCRAN repository, you have two options.

  1. First, you can use the file:/// URI convention, e.g.

    install.packages("ggplot2", repos="file:///path/to/file/")
    
  2. Alternatively, you can configure the destination as an HTTP server and make your repository available via a URL. In this case, your local repository will look and feel exactly like a CRAN mirror, except that it contains only your desired packages.
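
For the first option, the number of slashes depends on the platform: a Windows path such as `C:/repo` needs `file:///` in front, while a Unix-like path already starts with a slash. A minimal sketch of a helper that handles both cases follows. The name `local_repo_uri` is made up for this example, and it assumes an ordinary local directory that already exists (not a Windows UNC share).

```r
# Hypothetical helper: turn an existing local directory into a file:/// URI
# that can be passed as `repos` to install.packages().
local_repo_uri <- function(path) {
  # Resolve to an absolute path with forward slashes; errors if it is missing.
  path <- normalizePath(path, winslash = "/", mustWork = TRUE)
  if (substr(path, 1, 1) == "/") {
    paste0("file://", path)    # Unix-like: the path supplies the third slash
  } else {
    paste0("file:///", path)   # Windows: drive-letter paths need all three
  }
}

# Usage on the offline machine (assumes the repository is in ./local_CRAN):
# install.packages("ggplot2", repos = local_repo_uri("local_CRAN"), type = "source")
```

Because the URI points at the repository root, `install.packages()` can read the index itself and resolve the dependency order without any manual sorting.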

Answered by Andrie on Oct 21 '22 at 15:10