I cannot use the subset argument of xtabs or aggregate (or any function I tested, including ftable and lm) with mapply. The following calls fail with the subset argument, but they work without it:
mapply(FUN = xtabs,
formula = list(~ wool,
~ wool + tension),
subset = list(breaks < 15,
breaks < 20),
MoreArgs = list(data = warpbreaks))
# Error in mapply(FUN = xtabs, formula = list(~wool, ~wool + tension), subset = list(breaks < :
# object 'breaks' not found
#
# expected result 1/2:
# wool
# A B
# 2 2
#
# expected result 2/2:
# tension
# wool L M H
# A 0 4 3
# B 2 2 5
mapply(FUN = aggregate,
formula = list(breaks ~ wool,
breaks ~ wool + tension),
subset = list(breaks < 15,
breaks < 20),
MoreArgs = list(data = warpbreaks,
FUN = length))
# Error in mapply(FUN = aggregate, formula = list(breaks ~ wool, breaks ~ :
# object 'breaks' not found
#
# expected result 1/2:
# wool breaks
# 1 A 2
# 2 B 2
#
# expected result 2/2:
# wool tension breaks
# 1 B L 2
# 2 A M 4
# 3 B M 2
# 4 A H 3
# 5 B H 5
The errors seem to be due to the subset arguments not being evaluated in the right environment. I know I can subset in the data argument with data = warpbreaks[warpbreaks$breaks < 20, ] as a workaround, but I am looking to improve my knowledge of R.
My questions are:
How can I use subset arguments with mapply? I tried with match.call and eval.parent, but without success so far (more details in my previous questions).
Why is the formula argument evaluated in data = warpbreaks, but the subset argument is not?
The short answer is that when you create a list to pass as an argument to a function, its elements are evaluated at the point of creation. The error you are getting arises because R tries to build the list in the calling environment, where breaks does not exist.
To see this more clearly, suppose you create the arguments you want to pass before calling mapply:
f_list <- list(~ wool, ~ wool + tension)
d_list <- list(data = warpbreaks)
mapply(FUN = xtabs, formula = f_list, MoreArgs = d_list)
#> [[1]]
#> wool
#> A B
#> 27 27
#>
#> [[2]]
#> tension
#> wool L M H
#> A 9 9 9
#> B 9 9 9
There is no problem with creating a list of formulas, because these are not evaluated until needed, and warpbreaks is accessible from the global environment, so this call to mapply works.
However, if you try to create the following list ahead of the mapply call:
subset_list <- list(breaks < 15, breaks < 20)
then R will tell you that breaks isn't found.
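The contrast between the two kinds of list can be checked directly. The following is only a small sketch: the names f, e and msg are illustrative, and it assumes no object called breaks exists in the workspace.

```r
# A formula and a quoted call both store their expression unevaluated,
# so neither needs `breaks` to exist when it is created.
f <- ~ breaks
e <- quote(breaks < 15)

# list() evaluates its arguments immediately; capture the error message
# instead of stopping.
msg <- tryCatch(list(breaks < 15),
                error = function(err) conditionMessage(err))

class(f)
class(e)
msg
```

The first two objects are created without complaint, while msg holds the "not found" message raised when list() tried to evaluate breaks < 15 on the spot.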
However, if you create the list inside with(), so that the columns of warpbreaks are visible while it is built, then you won't have a problem:
subset_list <- with(warpbreaks, list(breaks < 15, breaks < 20))
subset_list
#> [[1]]
#> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> [14] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
#> [27] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
#> [40] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
#> [53] FALSE FALSE
#>
#> [[2]]
#> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE
#> [14] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE TRUE
#> [27] FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
#> [40] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE
#> [53] TRUE FALSE
So you would think that we could just pass this to mapply and everything would be fine, but now we get a new error:
mapply(FUN = xtabs, formula = f_list, subset = subset_list, MoreArgs = d_list)
#> Error in eval(substitute(subset), data, env) : object 'dots' not found
So why are we getting this?
The problem lies in any function passed to mapply that calls eval, or that itself calls a function that uses eval.
If you look at the source code for mapply, you will see that it takes the extra arguments you have passed and puts them in a list called dots, which it then passes to an internal mapply call:
mapply
#> function (FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)
#> {
#> FUN <- match.fun(FUN)
#> dots <- list(...)
#> answer <- .Internal(mapply(FUN, dots, MoreArgs))
#> ...
If your FUN itself calls another function that calls eval on the unevaluated expression of one of its arguments, that expression will refer to the object dots, which does not exist in the environment in which the eval is carried out. This is easy to see by running mapply on a match.call wrapper:
mapply(function(x) match.call(), x = list(1))
#> [[1]]
#> (function(x) match.call())(x = dots[[1L]][[1L]])
So a minimal reproducible example of our error is
mapply(function(x) eval(substitute(x)), x = list(1))
#> Error in eval(substitute(x)) : object 'dots' not found
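By contrast, eval has something meaningful to work with when the function receives the expression itself as a value, rather than recovering it with substitute. The following is a minimal sketch under that assumption; the argument name expr and the use of quote() are illustrative choices, not part of the original calls.

```r
# Each element of `expr` is a quoted (unevaluated) call. mapply hands it to
# the function as an ordinary value, so no reference to `dots` is involved.
res <- mapply(
  FUN = function(expr, data) eval(expr, data),  # evaluate inside the data frame
  expr = list(quote(breaks < 15), quote(breaks < 20)),
  MoreArgs = list(data = warpbreaks),
  SIMPLIFY = FALSE
)

# One logical vector per condition, with one entry per row of warpbreaks
lengths(res)
```

This only demonstrates the evaluation mechanism. It does not by itself make the subset argument of xtabs usable through mapply.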
So what's the solution? It seems like you have already hit on a perfectly good one, that is, manually subsetting the data frame you wish to pass. Others may suggest that you explore purrr::map to get a more elegant solution.
However, it is possible to get mapply to do what you want, and the secret is just to modify FUN to turn it into an anonymous wrapper of xtabs that subsets on the fly:
mapply(FUN = function(formula, subset, data) xtabs(formula, data[subset,]),
formula = list(~ wool, ~ wool + tension),
subset = with(warpbreaks, list(breaks < 15, breaks < 20)),
MoreArgs = list(data = warpbreaks))
#> [[1]]
#> wool
#> A B
#> 2 2
#>
#> [[2]]
#> tension
#> wool L M H
#> A 0 4 3
#> B 2 2 5
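The same wrapper idea should carry over to the aggregate call from the question. The following is a sketch rather than a tested recipe for every R version: the formula is passed to aggregate by position, and SIMPLIFY = FALSE is an added assumption so that mapply returns a plain list of data frames.

```r
# Wrap aggregate so that the subsetting happens on the data frame itself,
# before aggregate ever sees it.
res <- mapply(
  FUN = function(formula, subset, data) {
    aggregate(formula, data = data[subset, ], FUN = length)
  },
  formula = list(breaks ~ wool, breaks ~ wool + tension),
  subset = with(warpbreaks, list(breaks < 15, breaks < 20)),
  MoreArgs = list(data = warpbreaks),
  SIMPLIFY = FALSE  # keep each result as its own data frame
)

res[[1]]  # counts per wool for breaks < 15
res[[2]]  # counts per wool and tension for breaks < 20
```

Because the logical vectors are built with with() ahead of the call, nothing inside aggregate needs to evaluate a subset expression, which sidesteps the dots problem described above.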