
Masking methods in R

Tags: r, dplyr, r-package

An earlier question, and in particular one of its answers, brought up the following question: How can I get a warning about the masking of methods in R?

If you run the following code in a clean R session, you'll notice that loading dplyr changes the default method for lag.

lag(1:3, 1)
## [1] 1 2 3
## attr(,"tsp")
## [1] 0 2 1
require(dplyr)
lag(1:3, 1)
## [1] NA  1  2

If you attach the package dplyr, you get warnings for several masked objects, but no warning about the default method for lag being masked. The reason is that calling lag still calls the generic function from the stats package; only the method it dispatches to has changed.

lag
## function (x, ...) 
## UseMethod("lag")
## <bytecode: 0x000000000c072188>
## <environment: namespace:stats>
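To illustrate why the generic itself looks unchanged, here is a minimal sketch in base R. The generic `describe` is a hypothetical stand-in for `lag`, and the two anonymous methods stand in for the stats and dplyr versions of `lag.default`. Registering a second method for the same generic/class pair replaces the first one without touching the generic.

```r
# Toy generic standing in for stats::lag (the name is hypothetical).
describe <- function(x, ...) UseMethod("describe")

# The first registration plays the role of the method that ships
# with the generic.
registerS3method("describe", "default",
                 function(x, ...) "original default",
                 envir = environment(describe))
first <- describe(1)

# A later registration for the same generic/class pair replaces the
# earlier one; the generic `describe` is not modified at all.
registerS3method("describe", "default",
                 function(x, ...) "replacement default",
                 envir = environment(describe))
second <- describe(1)

c(first, second)
```

Printing `describe` before and after the second registration shows the same function, which is why inspecting the generic gives no hint that dispatch has changed.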

And methods(lag) just tells me that there is a method lag.default. I can see that there are two methods using getAnywhere:

getAnywhere(lag.default)
## 2 differing objects matching ‘lag.default’ were found
## in the following places
## registered S3 method for lag from namespace dplyr
## namespace:dplyr
## namespace:stats
## Use [] to view one of them
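For a single, known method, the check can be scripted. The following is only a sketch: `method_origin` is a hypothetical helper name, and it relies on `getS3method()` from utils to resolve the method and on the method's enclosing environment to name the package. In a clean session it should report stats for `lag.default`; given the `getAnywhere` output above, it would presumably report dplyr once dplyr's namespace is loaded.

```r
# Report the environment of the method that getS3method() resolves
# for a generic/class pair (the helper name is hypothetical).
method_origin <- function(generic, class) {
  m <- getS3method(generic, class, optional = TRUE)
  if (is.null(m)) return(NA_character_)   # no such method found
  env <- environment(m)
  if (is.environment(env)) environmentName(env) else NA_character_
}

method_origin("lag", "default")
```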

But this requires that I know to check if the default lag method was changed by dplyr. Is there any way to check if methods were masked? Perhaps there is a function like this:

checkMethodMasking(dplyr)
## The following methods are masked from 'package:dplyr':
##    lag.default

NB: It is not enough to have a warning when I load dplyr with require(dplyr). The method also gets overridden if I just load the namespace without attaching the package (e.g. if I call dplyr::mutate, or even if I use a function from another package that calls a dplyr function imported using importFrom).
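The difference between loading and attaching can be seen with any package. The sketch below uses the tools package, which ships with R, purely as a stand-in for dplyr: after `loadNamespace()` the namespace is loaded (so its S3 registrations have run), yet nothing appears on the search path, so a check based on `search()` alone would miss it.

```r
# Load a namespace without attaching it. "tools" ships with R and is
# used here only as a stand-in for dplyr.
invisible(loadNamespace("tools"))

loaded   <- "tools" %in% loadedNamespaces()  # is the namespace loaded?
attached <- "package:tools" %in% search()    # is it on the search path?

c(loaded = loaded, attached = attached)
```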

asked Jun 04 '15 10:06 by shadow


2 Answers

Update: There is now an R package on GitHub that tries to address this. It is still far from an ideal solution, but it goes some way towards solving the issue. It currently has the functions require2, library2 and warnS3Methods.

devtools::install_github("blasern/warnS3")
require(warnS3)

# Examples
require2(dplyr)
## Loading required package: dplyr
##
## Attaching package: ‘dplyr’
##
## The following object is masked from ‘package:stats’:
##  
##  filter
##
## The following objects are masked from ‘package:base’:
##   
##  intersect, setdiff, setequal, union
## 
## The following methods are masked by 'package:dplyr':
##  
##  'lag.default' from 'package:stats'

require2(roxygen2)
## Loading required package: roxygen2
## The following methods are masked by 'package:roxygen2':
##  
##  'escape.character' from 'package:dplyr'

warnS3Methods()
## The following methods are available in multiple packages: 
##  
##  'escape.character' in packages: dplyr, roxygen2
##  'lag.default' in packages: dplyr, stats

This is only an idea of how one can find masked S3 methods. It is by no means a perfect solution, but I guess until somebody comes up with a better idea it will at least help with debugging.

#' Get all S3 methods from a package
#' 
#' Find all S3 methods from a package
#' 
#' @param pkg can be either the name of an installed package
#' or the path of a package
getPkgS3Methods <- function(pkg){
  if (basename(pkg) == pkg) pkg <- path.package(pkg)
  ns <- parseNamespaceFile(basename(pkg), 
                           dirname(pkg), 
                           mustExist = FALSE)
  if (length(ns$S3methods) == 0) return(NULL)
  df <- cbind.data.frame(basename(pkg), ns$S3methods)
  colnames(df) <- c("package", "method", "class", "other")
  df
}

#' Get masked S3 methods
#' 
#' Finds all S3 methods that are declared by more than one of the
#' currently loaded packages
getMaskedS3Methods <- function(){
  paths <- as.character(gtools::loadedPackages(silent = TRUE)[, "Path"])
  lst <- lapply(paths, getPkgS3Methods)
  all_methods <- do.call(rbind, lst)
  duplicates <- 
    duplicated(all_methods[, c("method", "class")]) |
    duplicated(all_methods[, c("method", "class")], fromLast = TRUE)
  res <- all_methods[duplicates, ]
  res[order(res$method, res$class, res$package), ]
}

Called from a clean workspace (with the above functions, but no packages loaded), you can then observe the following:

getMaskedS3Methods()
## [1] package method  class   other  
## <0 rows> (or 0-length row.names)

require(dplyr)
getMaskedS3Methods()
## package method   class other
## 143   dplyr    lag default  <NA>
## 438   stats    lag default  <NA>

That just tells you that there are two lag.default methods. It does not actually tell you which one is masking the other. It just points out potential problems.
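One way to narrow down which method is actually active is to look at the S3 registry of the generic's namespace and compare each registered method with the function of the same name defined in that namespace. The sketch below does this in base R only. It is an assumption-laden illustration: `find_overridden_methods` is a hypothetical name, and `.__S3MethodsTable__.` is an internal implementation detail of R that may change between versions.

```r
# For each loaded namespace, compare the methods in its S3 registry with
# the functions of the same name that the namespace defines itself.
# A mismatch suggests the method was re-registered from elsewhere.
# Assumption: relies on the internal `.__S3MethodsTable__.` environment.
find_overridden_methods <- function(namespaces = loadedNamespaces()) {
  empty <- data.frame(method = character(), generic_pkg = character(),
                      active_from = character(), stringsAsFactors = FALSE)
  rows <- list(empty)
  for (ns_name in namespaces) {
    ns <- asNamespace(ns_name)
    table <- get0(".__S3MethodsTable__.", envir = ns, inherits = FALSE)
    if (!is.environment(table)) next
    for (m in ls(table, all.names = TRUE)) {
      registered <- get0(m, envir = table, inherits = FALSE)
      own <- get0(m, envir = ns, mode = "function", inherits = FALSE)
      # Skip entries the namespace does not define, or that are unchanged.
      if (!is.function(registered) || is.null(own) ||
          identical(registered, own)) next
      env <- environment(registered)
      rows[[length(rows) + 1L]] <- data.frame(
        method = m,
        generic_pkg = ns_name,
        active_from = if (is.environment(env)) environmentName(env)
                      else NA_character_,
        stringsAsFactors = FALSE)
    }
  }
  do.call(rbind, rows)
}

find_overridden_methods()
```

Unlike the duplicate listing above, each row names the package whose generic is affected (`generic_pkg`) and the environment that supplies the currently registered method (`active_from`). It only catches overrides of methods that the generic's own package defines, so it is a complement to `getMaskedS3Methods`, not a replacement.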

answered Oct 08 '22 17:10 by shadow


The conflicted package now offers a potential solution to this problem. With conflicted loaded, you get explicit error messages about conflicting function names. You can also use conflict_prefer to specify which package's function you want to use by default and which should be masked.

For example, here is a recent error I got when attempting to use the function parallel from the nFactors package:

# Error: [conflicted] `parallel` found in 2 packages.
# Either pick the one you want with `::` 
# * nFactors::parallel
# * lattice::parallel
# Or declare a preference with `conflict_prefer()`
# * conflict_prefer("parallel", "nFactors")
# * conflict_prefer("parallel", "lattice")

I then added

conflict_prefer("parallel", "nFactors") 

right after the code loading my libraries at the beginning of the script to make sure that parallel would call nFactors::parallel in my code.
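For comparison, base R can list name conflicts on the search path without any extra package, using `conflicts()`. The sketch below attaches a throwaway list (the name `demo_pkg` is hypothetical) that defines its own `union`, then asks where that name is defined. Note that this, like the error message above, concerns objects on the search path; it does not look at S3 method registrations.

```r
# Attach a throwaway list that defines its own `union`
# (the name "demo_pkg" is hypothetical).
attach(list(union = function(x, y) "masked"), name = "demo_pkg",
       warn.conflicts = FALSE)

# conflicts(detail = TRUE) returns, per search-path entry, the names
# that are defined in more than one place.
masked <- conflicts(detail = TRUE)
hits <- names(Filter(function(objs) "union" %in% objs, masked))

# Clean up the search path again.
detach("demo_pkg", character.only = TRUE)

hits
```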

answered Oct 08 '22 16:10 by ktur